# Rememberizer LLM Ready Documentation
*Generated at 2025-09-25 20:09:25 PDT. Available as raw content at [Rememberizer llms-full.txt](https://llm.rememberizer.ai/llms-full.txt).*
This document provides a comprehensive, consolidated reference of Rememberizer's documentation, optimized for large language model (LLM) consumption. It combines various documentation sources into a single, easily accessible format, facilitating efficient information retrieval and processing by AI systems.
```
==> SUMMARY.md <==
# Table of contents
* [Why Rememberizer?](README.md)
* [Background](background/README.md)
* [What are Vector Embeddings and Vector Databases?](background/what-are-vector-embeddings-and-vector-databases.md)
* [Glossary](background/glossary.md)
* [Standardized Terminology](background/standardized-terminology.md)
## Personal Use
* [Getting Started](personal/README.md)
* [Search your knowledge](personal/search-your-knowledge.md)
* [Mementos Filter Access](personal/mementos-filter-access.md)
* [Common knowledge](personal/common-knowledge.md)
* [Manage your embedded knowledge](personal/manage-your-embedded-knowledge.md)
* [Integrations](personal/integrations.md)
* [Rememberizer App](personal/rememberizer-app.md)
* [Rememberizer Slack integration](personal/rememberizer-slack-integration.md)
* [Rememberizer Google Drive integration](personal/rememberizer-google-drive-integration.md)
* [Rememberizer Dropbox integration](personal/rememberizer-dropbox-integration.md)
* [Rememberizer Gmail integration](personal/rememberizer-gmail-integration.md)
* [Rememberizer Memory integration](personal/rememberizer-memory-integration.md)
* [Rememberizer MCP Servers](personal/rememberizer-mcp-servers.md)
* [Manage third-party apps](personal/manage-third-party-apps.md)
## Developer Resources
* [Developer Overview](developer/README.md)
* [Integration Options](developer/integration-options.md)
* [Registering and using API Keys](developer/registering-and-using-api-keys.md)
* [Registering Rememberizer apps](developer/registering-rememberizer-apps.md)
* [Authorizing Rememberizer apps](developer/authorizing-rememberizer-apps.md)
* [Creating a Rememberizer GPT](developer/creating-a-rememberizer-gpt.md)
* [LangChain integration](developer/langchain-integration.md)
* [Vector Stores](developer/vector-stores.md)
* [Talk-to-Slack the Sample Web App](developer/talk-to-slack-the-sample-web-app.md)
* [Enterprise Integration](developer/enterprise-integration.md)
* [Enterprise Integration Patterns](developer/enterprise-integration-patterns.md)
* [API Reference](developer/api-docs/README.md)
* [Authentication](developer/api-docs/authentication.md)
* [Core APIs](developer/api-docs/README.md)
* [Account & Configuration](developer/api-docs/README.md)
## Additional Resources
* [Notices](notices/README.md)
* [Terms of Use](notices/terms-of-use.md)
* [Privacy Policy](notices/privacy-policy.md)
* [B2B](notices/b2b/README.md)
* [About Reddit Agent](notices/b2b/about-reddit-agent.md)
* [Releases](releases/README.md)
* [2025 Releases](additional-resources/releases/2025-releases/README.md)
* [Sep 26th, 2025](releases/sep-26th-2025.md)
* [Aug 22nd, 2025](releases/aug-22nd-2025.md)
* [Aug 8th, 2025](releases/aug-8th-2025.md)
* [Aug 1st, 2025](releases/aug-1st-2025.md)
* [Jul 25th, 2025](releases/jul-25th-2025.md)
* [Jul 18th, 2025](releases/jul-18th-2025.md)
* [Jul 11th, 2025](releases/jul-11th-2025.md)
* [Jul 4th, 2025](releases/jul-4th-2025.md)
* [Jun 27th, 2025](releases/jun-27th-2025.md)
* [Jun 20th, 2025](releases/jun-20th-2025.md)
* [Jun 6th, 2025](releases/jun-6th-2025.md)
* [May 30th, 2025](releases/may-30th-2025.md)
* [May 23rd, 2025](releases/may-23rd-2025.md)
* [Apr 25th, 2025](releases/apr-25th-2025.md)
* [Apr 18th, 2025](releases/apr-18th-2025.md)
* [Apr 11th, 2025](releases/apr-11th-2025.md)
* [Apr 4th, 2025](releases/apr-4th-2025.md)
* [Mar 28th, 2025](releases/mar-28th-2025.md)
* [Mar 21st, 2025](releases/mar-21st-2025.md)
* [Mar 14th, 2025](releases/mar-14th-2025.md)
* [Jan 17th, 2025](releases/jan-17th-2025.md)
* [2024 Releases](additional-resources/releases/2024-releases/README.md)
* [December 2024](additional-resources/releases/2024-releases/december-2024.md)
* [Dec 27th, 2024](releases/dec-27th-2024.md)
* [Dec 20th, 2024](releases/dec-20th-2024.md)
* [Dec 13th, 2024](releases/dec-13th-2024.md)
* [Dec 6th, 2024](releases/dec-6th-2024.md)
* [Nov 29th, 2024](releases/nov-29th-2024.md)
* [Nov 22nd, 2024](releases/nov-22nd-2024.md)
* [Nov 15th, 2024](releases/nov-15th-2024.md)
* [Nov 8th, 2024](releases/nov-8th-2024.md)
* [Nov 1st, 2024](releases/nov-1st-2024.md)
* [Oct 25th, 2024](releases/oct-25th-2024.md)
* [Oct 18th, 2024](releases/oct-18th-2024.md)
* [Oct 11th, 2024](releases/oct-11th-2024.md)
* [Oct 4th, 2024](releases/oct-4th-2024.md)
* [Sep 27th, 2024](releases/sep-27th-2024.md)
* [Sep 20th, 2024](releases/sep-20th-2024.md)
* [Sep 13th, 2024](releases/sep-13th-2024.md)
* [Aug 16th, 2024](releases/aug-16th-2024.md)
* [Aug 9th, 2024](releases/aug-9th-2024.md)
* [Aug 2nd, 2024](releases/aug-2nd-2024.md)
* [Jul 26th, 2024](releases/jul-26th-2024.md)
* [Jul 12th, 2024](releases/jul-12th-2024.md)
* [Jun 28th, 2024](releases/jun-28th-2024.md)
* [Jun 14th, 2024](releases/jun-14th-2024.md)
* [May 31st, 2024](releases/may-31st-2024.md)
* [May 17th, 2024](releases/may-17th-2024.md)
* [May 10th, 2024](releases/may-10th-2024.md)
* [Apr 26th, 2024](releases/apr-26th-2024.md)
* [Apr 19th, 2024](releases/apr-19th-2024.md)
* [Apr 12th, 2024](releases/apr-12th-2024.md)
* [Apr 5th, 2024](releases/apr-5th-2024.md)
* [Mar 25th, 2024](releases/mar-25th-2024.md)
* [Mar 18th, 2024](releases/mar-18th-2024.md)
* [Mar 11th, 2024](releases/mar-11th-2024.md)
* [Mar 4th, 2024](releases/mar-4th-2024.md)
* [Feb 26th, 2024](releases/feb-26th-2024.md)
* [Feb 19th, 2024](releases/feb-19th-2024.md)
* [Feb 12th, 2024](releases/feb-12th-2024.md)
* [Feb 5th, 2024](releases/feb-5th-2024.md)
* [Jan 29th, 2024](releases/jan-29th-2024.md)
* [Jan 22nd, 2024](releases/jan-22nd-2024.md)
* [Jan 15th, 2024](releases/jan-15th-2024.md)
* [Rememberizer LLM Ready Documentation](rememberizer-llm-ready-documentation.md)
==> README.md <==
---
description: Introduction
---
# Why Rememberizer?
Generative AI apps work better when they have access to background information. They need to know what you know. A great way to achieve that is to give them access to relevant content from the documents, data and discussions you create and use. That is what Rememberizer does.
==> personal/rememberizer-slack-integration.md <==
---
description: >-
This guide will walk you through the process of integrating your Slack
workspace into Rememberizer as a knowledge source.
type: guide
last_updated: 2025-04-03
---
# Rememberizer Slack Integration
## Overview
The Slack integration allows you to connect your Slack workspace to Rememberizer, enabling AI applications to search and reference your team's Slack messages and shared files. This integration creates a searchable knowledge base from your conversations, announcements, questions, and decisions captured in Slack.
## Before You Begin
Before connecting Slack to Rememberizer, ensure you:
- Have a Rememberizer account
- Have access to a Slack workspace where you have permission to install apps
- Understand which channels contain knowledge you want to make searchable
- Consider any organizational data policies regarding third-party integrations
## Connection Process
### Step 1: Access the Knowledge Sources
1. Sign in to your Rememberizer account
2. Navigate to **Personal > Your Knowledge** tab, or visit [https://rememberizer.ai/personal/knowledge](https://rememberizer.ai/personal/knowledge)
3. You should see all available knowledge sources, including Slack
Your Knowledge sources page with Slack option
### Step 2: Initiate Slack Connection
1. Click the **"Connect"** button on the Slack knowledge source card
2. You will be redirected to Slack's authorization page
3. Select the Slack workspace you want to connect (if you belong to multiple workspaces)
Slack OAuth authorization screen
> **Note:** If you see a warning that this application is not authorized by Slack, it is because Rememberizer is intended to search for Slack content outside of Slack, which is against the [Slack App Directory Guidelines](https://api.slack.com/directory/guidelines). This doesn't affect the functionality or security of the integration.
### Step 3: Grant Permissions
1. Review the permissions Rememberizer is requesting:
- Read access to public channels
- Read access to private channels you're a member of
- Read access to message history
- Read access to files
2. Click **"Allow"** to install the Rememberizer Slack app to your workspace
### Step 4: Select Channels for Indexing
1. After successful authorization, you'll be redirected back to Rememberizer
2. A side panel will automatically open showing available channels
3. If the panel doesn't appear, click the **"Select"** button next to your Slack workspace
Successful Slack workspace connection
### Step 5: Choose Specific Channels
1. In the side panel, browse the list of available channels
2. Select checkboxes next to channels you want to include
3. You can filter by channel type (public/private) or search by name
4. Consider starting with just a few channels for faster initial processing
Select specific Slack channels to index
### Step 6: Begin Processing
1. After selecting channels, click **"Save"** at the bottom of the panel
2. Rememberizer will begin downloading, processing, and embedding messages
3. You'll see a progress indicator as channels are processed
4. Initial processing may take several minutes to hours depending on the volume of messages
## How Slack Data is Processed
When you connect Slack to Rememberizer, the following occurs:
1. **Authentication**: Secure OAuth connection established with refresh token capability
2. **Channel Selection**: Only your selected channels are accessed
3. **Message Retrieval**: Messages and threaded replies are downloaded in batches (see the sketch after this list)
4. **Content Processing**:
- Messages are chunked into appropriate segments
- Vector embeddings are generated to capture semantic meaning
- Files shared in messages are processed based on supported formats
5. **Continuous Updates**: New messages are periodically synced (approximately every 6 hours)
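To give a sense of what the batch retrieval step looks like in general, here is a conceptual sketch using the official `slack_sdk` Web API client with cursor-based pagination. It is an illustration only, not Rememberizer's internal pipeline; the token and channel ID are placeholders.
||CODE_BLOCK||python
# Conceptual sketch of cursor-paginated Slack history retrieval (not Rememberizer's actual code).
from slack_sdk import WebClient

client = WebClient(token="xoxb-your-bot-token")  # placeholder token
channel_id = "C0123456789"                       # placeholder channel ID

cursor = None
while True:
    page = client.conversations_history(channel=channel_id, cursor=cursor, limit=200)
    for message in page["messages"]:
        # Each message (and its threaded replies) would then be chunked and embedded.
        print(message.get("ts"), message.get("text", "")[:80])
    cursor = page.get("response_metadata", {}).get("next_cursor")
    if not cursor:
        break
||CODE_BLOCK||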
## Data Refresh and Synchronization
Rememberizer automatically keeps your Slack knowledge up to date:
- **Incremental Updates**: Only new or changed messages are processed after initial indexing
- **Update Schedule**: Automatic synchronization occurs approximately every 6 hours
- **Thread Monitoring**: New replies in threads are detected and indexed
- **Manual Refresh**: Force an immediate update by clicking the "Refresh" icon next to your Slack connection
## Security and Privacy Considerations
The Slack integration includes several security measures:
- **OAuth Security**: Industry-standard authorization protocol with token encryption
- **Selective Access**: Only processes channels you explicitly select
- **Encrypted Storage**: All message content is encrypted before storage
- **No Message Alteration**: Read-only access to your Slack workspace (cannot post or modify messages)
- **Permission Scopes**: Limited to only the permissions needed for search functionality
- **Account Linking**: Connection is specific to your Rememberizer account only
## Troubleshooting Common Issues
### Authorization Failures
**Problem**: Unable to connect to Slack or authorization errors.
**Solutions**:
- Ensure you have permissions to install apps in your workspace
- Try disconnecting and reconnecting the integration
- Check if your organization uses Slack Enterprise Grid with app restrictions
### Missing Channels
**Problem**: Some channels don't appear in the selection panel.
**Solutions**:
- Verify you're a member of the channels you want to index
- For private channels, you must be a member to index them
- Refresh the channel list by clicking the refresh icon
### Processing Delays
**Problem**: Indexing is taking a very long time.
**Solutions**:
- Start with fewer channels and add more gradually
- Check if channels have an extremely large message history
- Verify your internet connection is stable
### Authentication Expiration
**Problem**: Integration stops working after some time.
**Solutions**:
- Reconnect the integration through the Knowledge page
- Check if the Slack app was removed from your workspace
- Verify your Rememberizer account is active
## Limitations and Considerations
- **Message History**: Up to 100,000 messages per channel can be indexed
- **File Types**: Supported files include PDFs, text documents, and spreadsheets
- **Private Channels**: Only private channels you're a member of can be indexed
- **Direct Messages**: DMs are not currently supported for privacy reasons
- **Enterprise Restrictions**: Some Slack Enterprise Grid features may affect integration
## What's Next?
After connecting Slack to Rememberizer:
1. Use [Mementos](mementos-filter-access.md) to control which AI tools can access your Slack knowledge
2. Combine Slack with other knowledge sources for comprehensive context
3. Try [searching your knowledge](https://rememberizer.ai/personal/search) through the web UI
4. Connect your knowledge to AI tools using GPT integration or the Rememberizer API
If you encounter any issues during setup or use, contact our support team for assistance.
==> personal/manage-third-party-apps.md <==
# Manage third-party apps
## Explore third-party apps and services
You can view and explore all third-party apps that connect with Rememberizer on the **App directory** page by following the instructions below.
* On the navigation bar, choose **Personal > Find an App**. Then, you will see the App directory page.
Navigation bar browsing App Directory page
App directory page
* Find the app you want to explore. You can do this by typing the name of the app in the search bar, with an optional **filter and sorting order.**
Search bar with filter and sort order button
* Click on the **name of the third-party app** or **Explore button** to open the app.
App's name and Explore button
* When you use the app, it will require authorization with Rememberizer. Technical details of the flow are described on the [authorizing-rememberizer-apps.md](../developer/authorizing-rememberizer-apps.md "mention") page. We will use the **Rememberizer GPT app** as an example of the authorization UI flow. After the first chat, you will see the app ask you to sign in to Rememberizer.
Sign in request from Rememberizer GPT app
* Click on the **Sign in** button. You will be redirected to the Authorization page.
Authorization page
* You can modify the Memento and Memory that the app can view and use by clicking the **Change** button and selecting what you want.
> **Note:** For detailed information about Mementos, please visit the [mementos-filter-access.md](mementos-filter-access.md "mention") page.
> **Note:** For detailed information about Memory integration, please visit the [rememberizer-memory-integration.md](rememberizer-memory-integration.md "mention") page.
* Click **Authorize** to complete the process. You will then be directed back to the app and can chat with it normally.
> **Note:** If you click the **Cancel** button, you will be directed back to the app landing page. The app will no longer be displayed on the **App directory** page but will instead appear on the **Your connected apps** page. If you want to completely cancel the authorization process, see [#manage-your-connected-apps](manage-third-party-apps.md#manage-your-connected-apps "mention") below.
Successfully connected account
## Manage your connected apps
On the **App directory** page, choose **Your connected apps** to browse the page.
Your connected apps page
This page categorizes apps into two types based on their status: **Pending Apps** and **Connected Apps**.
* **Pending Apps**: These are apps for which you clicked the **Cancel** button while authorizing them on Rememberizer.
* Click **Continue** if you want to complete the authorization process.
* Otherwise, click **Cancel** to completely withdraw the authorization. The app will then be displayed in **App Directory** page again.
* **Connected Apps:** You can configure the **Memento** or **Memory integration** of a specific connected app by clicking the Change option (or Select if a Memento has not yet been chosen). Click **Disconnect** if you want to disconnect the third-party app from Rememberizer.
==> personal/rememberizer-memory-integration.md <==
# Rememberizer Memory integration
### Introduction
Rememberizer Memory allows third-party apps to store and access data in a user's Rememberizer account, providing a simple way for valuable information to be saved and utilized across a user's applications.
### Benefits
#### For Users
Shared Memory creates a single place where key results and information from all the user's apps are available in one location. Some benefits for users include:
* Easy Access: Important data is centralized, allowing both the user and their apps to easily access results from multiple apps in one place.
* Sync Between Apps: Information can be shared and synced between a user's different apps seamlessly without extra effort from the user.
* Persistent Storage: Data remains accessible even if individual apps are uninstalled, unlike app-specific local storage.
#### For App Developers
Shared Memory provides app developers with a simple way to access data from a user's other connected apps:
* No Backend Needed: Apps do not need to develop their own custom backend systems to store and share data.
* Leverage Other Apps: Apps can build on and utilize public data generated by a user's other installed apps, enriching their own functionality.
* Cross-App Integration: Seamless integration and data sharing capabilities are enabled between an app developer's different apps.
By default all apps have read-only access to Shared Memory, while each app can write only to its own memory space. The user has controls to customize access permissions as needed. This balances data sharing with user privacy and control.
### Configure Your Memory
#### Global Settings
The Global Settings allow users to configure the default permissions for all apps using Shared Memory. This includes:
Config Memory in Knowledge Page
#### Default Memory and Data Access Permissions for Apps
* **Read Own/Write Own:** Apps are exclusively permitted to access and modify their own memory data.
* **Read All/Write Own:** Apps can read memory data across all apps but are restricted to modifying only their own memory data.
* **Disable Memory:** By default, apps cannot access or store memory data.
* **Apply to All Option**: Users can reset all app-specific permission settings back to the defaults chosen in Global Settings.
Users can clear all Memory documents with the _**Forget your memory**_ option:
Confirmation Modal when Forget Memory
#### App Settings
For each connected app, users can customize the Shared Memory permissions. Click **"Find an App"**, then click **"Your connected apps"**, or go to [https://rememberizer.ai/personal/apps/connected](https://rememberizer.ai/personal/apps/connected) to see the list of your connected apps. Then click **"Change"** on the Memory setting of the app you want to customize:
Config Memory for each App in Connected Apps Page
#### Memory Access Permissions for Apps
* **Read Own/Write Own**: Permissions allow the app to only access and modify its own memory data, preventing it from interacting with other apps' memory.
* **Read All/Write Own**: The app can view memory data from all apps but is restricted to modifying only its own memory data.
* **Disable Memory**: The app is prohibited from accessing or modifying memory data.
This gives users fine-grained control over how each app can utilize Shared Memory, based on the user's trust in that specific app. Permissions for individual apps can be more restrictive than the global defaults.
Together, the Global and App Settings give users powerful yet easy-to-use controls over how their data is shared through Shared Memory.
### Integrate with Memory Feature
#### API Endpoint
Rememberizer exposes an API endpoint, [/**api/v1/documents/memorize/**](https://docs.rememberizer.ai/developer/api-docs/memorize-content-to-rememberizer), that GPT apps can call to memorize content. A minimal example call is sketched below.
Note: This API is currently available only for Memory used with [3rd-party apps with OAuth2 authentication](../developer/authorizing-rememberizer-apps.md) (not yet with an [API key](../developer/registering-and-using-api-keys.md)).
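As a rough illustration, the sketch below shows what a call to this endpoint might look like from a third-party app holding an OAuth2 access token. The base URL, the `name` and `content` fields, and the response handling are assumptions modeled on the `remember_this` MCP tool; consult the linked API reference for the authoritative schema.
||CODE_BLOCK||python
# Hypothetical sketch of calling the memorize endpoint with an OAuth2 token.
# The base URL and payload fields are assumptions; see the API reference for
# the authoritative request schema.
import requests

ACCESS_TOKEN = "oauth2-access-token-from-the-app-authorization-flow"  # placeholder

response = requests.post(
    "https://api.rememberizer.ai/api/v1/documents/memorize/",  # assumed base URL
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json={
        "name": "Zero-Cost Abstractions",                      # label for the memorized note
        "content": "Key points the user asked the app to remember.",
    },
    timeout=30,
)
response.raise_for_status()
print(response.json())
||CODE_BLOCK||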
#### Memorize your knowledge
After authorizing with Rememberizer, the third-party app can memorize its valuable knowledge.
Here, we demonstrate the process using the Rememberizer GPT App.
* After using the Rememberizer GPT App, the user wants to memorize the third point, "Zero-Cost Abstractions".
Interacting with Rememberizer GPT Apps
* To use the Rememberizer App's Memory feature, the user must first authorize the app to access their project. Use the **memorize** command to tell the app what knowledge it needs to store.
Sign In to authorize Rememberizer
* Users can configure the Memory option here; the default value is based on the Global Settings.
Authorizing Screen
Rememberizer now successfully memorizes the knowledge.
* In Rememberizer, users can see the recent content on the **Embed Knowledge Details** page.
With the **Talk to Slack** app, users can seamlessly continue their work using the data they have committed to memory. For example, they can easily query and retrieve the memorized information.
Recall Memory Data in another app
### Using Memory Data via Memento
* Another way to utilize the Memory data is by creating a **Memento** and refining the Memory into it. Visit the [Memento Feature](mementos-filter-access.md#how-to-create-a-mementos) section for creation instructions.
* Rememberizer saves the content into files, and users can choose any app's content to refine into a **Memento**.
> Note: In older versions, Rememberizer saved content into files and combined them into a folder for each date.
With the [Memento Feature](https://docs.rememberizer.ai/personal/mementos-filter-access#what-is-a-memento-and-why-do-they-exist), users can utilize the Memory data even when the app's Memory configuration is turned off.
### Search Memory document in Rememberizer
You can also [Search Your Knowledge](https://rememberizer.ai/personal/search) through our web UI, or better, use this knowledge in an LLM through our GPT app or our public API.
==> personal/rememberizer-dropbox-integration.md <==
---
description: >-
This guide will walk you through the process of integrating your Dropbox into
Rememberizer as a knowledge source.
type: guide
last_updated: 2025-04-03
---
# Rememberizer Dropbox Integration
## Overview
The Dropbox integration allows you to connect your Dropbox files and folders to Rememberizer, creating a searchable knowledge base from your documents, presentations, spreadsheets, and other content. This integration enables AI applications to reference your Dropbox content when answering questions or generating insights.
## Before You Begin
Before connecting Dropbox to Rememberizer, ensure you:
- Have a Rememberizer account
- Have a Dropbox account with content you want to make searchable
- Understand which files and folders contain valuable knowledge
- Consider any organizational or personal data policies
## Connection Process
### Step 1: Access Knowledge Sources
1. Sign in to your Rememberizer account
2. Navigate to **Personal > Your Knowledge** tab, or visit [https://rememberizer.ai/personal/knowledge](https://rememberizer.ai/personal/knowledge)
3. Locate the Dropbox card in the available knowledge sources
Dropbox knowledge source card on the Knowledge page
### Step 2: Initiate Dropbox Connection
1. Click the **"Connect"** button on the Dropbox knowledge source card
2. You will be redirected to the Dropbox authorization page
3. If you're not already logged in to Dropbox, enter your credentials
4. Review the permissions Rememberizer is requesting
Dropbox permission request screen
### Step 3: Grant Permissions
1. Click **"Allow"** to authorize Rememberizer to access your Dropbox files
2. This grants Rememberizer read-only access to your files and folders
### Step 4: Return to Rememberizer
1. After successful authorization, you'll be redirected back to Rememberizer
2. The platform will display a connection confirmation
3. A file selection panel will automatically open
Successful Dropbox connection with file selection panel
### Step 5: Select Files and Folders
1. In the side panel, browse your Dropbox folder structure
2. Select specific files or entire folders by checking the boxes
3. Navigate through folders using the breadcrumb navigation
4. If the side panel doesn't appear, click the **"Select"** button next to your Dropbox connection
Select files and folders to process
### Step 6: Begin Processing
1. After selecting your files and folders, click **"Add"**
2. Rememberizer will begin downloading, processing, and embedding your files
3. You'll see a progress indicator as files are processed
4. Initial processing may take several minutes to hours depending on the amount of data
## How Dropbox Data is Processed
When you connect Dropbox to Rememberizer, the following occurs:
1. **Authentication**: Secure OAuth connection established with Dropbox
2. **File Selection**: Only your selected files and folders are accessed
3. **Content Extraction**: Text is extracted from compatible file formats
4. **Content Processing**:
- Documents are chunked into appropriate segments
- Vector embeddings are generated for each chunk
- Metadata such as file name, path, and type is preserved
5. **Continuous Updates**: Files are monitored for changes through the Dropbox API
## Supported File Types
The Dropbox integration supports various file formats, including:
| Category | Supported Formats |
|----------|-------------------|
| Text Documents | .txt, .md, .rtf, .csv |
| Office Documents | .docx, .doc, .xlsx, .xls, .pptx, .ppt |
| PDF Documents | .pdf |
| Code Files | .py, .js, .java, .html, .css, and more |
| Data Files | .json, .xml, .yaml, .csv |
## Data Refresh and Synchronization
Rememberizer automatically keeps your Dropbox knowledge up to date:
- **Change Detection**: Uses Dropbox's cursor-based synchronization to detect file changes (illustrated in the sketch after this list)
- **Update Schedule**: Automatic synchronization occurs approximately every 6 hours
- **Selective Updates**: Only changed files are reprocessed, not your entire Dropbox
- **Manual Refresh**: Force an immediate update by clicking the "Refresh" icon next to your Dropbox connection
- **New Files in Selected Folders**: If you select a folder, any new files added to that folder will be automatically detected and processed
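To illustrate the cursor-based mechanism in general terms, the sketch below uses the official Dropbox Python SDK: an initial listing returns a cursor, and replaying that cursor later yields only the entries that changed. This is a conceptual illustration, not Rememberizer's internal implementation; the token and folder path are placeholders.
||CODE_BLOCK||python
# Conceptual sketch of Dropbox cursor-based change detection (not Rememberizer's actual code).
import dropbox

dbx = dropbox.Dropbox("dropbox-access-token")  # placeholder token

# The initial listing returns entries plus a cursor marking the current state.
result = dbx.files_list_folder("/selected-folder", recursive=True)
cursor = result.cursor

# Later, replaying the cursor fetches only what changed since the last sync.
changes = dbx.files_list_folder_continue(cursor)
for entry in changes.entries:
    print("Changed:", entry.path_display)
||CODE_BLOCK||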
## Security and Privacy Considerations
The Dropbox integration includes several security measures:
- **OAuth Security**: Industry-standard authorization with secure token management
- **Selective Access**: Only processes files you explicitly select
- **Encrypted Storage**: All document content is encrypted before storage
- **Read-Only Access**: Cannot modify or delete your Dropbox files
- **Permission Revocation**: You can revoke access at any time through Dropbox settings
- **Data Handling**: Files are processed for vector embedding creation and original content is not permanently stored
## Troubleshooting Common Issues
### Authentication Problems
**Problem**: Unable to connect to Dropbox or authorization errors.
**Solutions**:
- Ensure you're using the correct Dropbox account
- Try clearing browser cookies and cache
- Verify you don't have browser extensions blocking OAuth redirects
- Check if Dropbox is accessible directly through your browser
### Missing Files or Folders
**Problem**: Some files or folders don't appear in the selection panel.
**Solutions**:
- Verify the files exist in your Dropbox account
- Try refreshing the file browser
- Check if files are in a shared folder with limited permissions
- Ensure files are synced to Dropbox (not just on your computer)
### Processing Errors
**Problem**: Files fail to process or show error status.
**Solutions**:
- Check if the file format is supported
- Verify the file isn't corrupted or password-protected
- For large files, allow more time for processing
- Try removing and re-adding problematic files
### Synchronization Issues
**Problem**: Updated Dropbox content isn't reflected in searches.
**Solutions**:
- Check when the last sync occurred (visible in the connection details)
- Manually trigger a refresh of your Dropbox connection
- If the file changed within the last 6 hours, allow time for the next automatic sync
- Check if the file still exists in the original location
## Managing Multiple Dropbox Accounts
### Connecting to Another Dropbox Account
If you want to switch to a different Dropbox account:
1. First, revoke Rememberizer's access to your current Dropbox account:
* Go to the [Dropbox website](https://www.dropbox.com/) and sign in
* Click your profile picture in the upper-right corner
* Select "Settings" from the dropdown menu
* Navigate to the "Connected apps" tab
* Find Rememberizer in the list and click "Disconnect"
* Sign out of your current Dropbox account
2. Then reconnect in Rememberizer:
* Go to the Knowledge page in Rememberizer
* If your current connection is still active, click the three dots (⋮) menu and select "Disconnect"
* Click "Connect" on the Dropbox card
* You should now be prompted to authorize with your new Dropbox account
Note: If you're still automatically connected to your previous account, try using private browsing/incognito mode to force Dropbox to prompt for authentication.
## Limitations and Considerations
- **Shared Folders**: Some shared folders may have permissions that affect processing
- **Paper Documents**: Dropbox Paper documents have limited support
- **File Size**: Very large files (>50MB) may process slowly or incompletely
- **Rate Limits**: Dropbox API rate limits may temporarily slow processing for large collections
- **Binary Files**: Executable files, images without text, and some specialized formats cannot be processed for text content
## Managing Your Dropbox Connection
### Adding More Files Later
1. Navigate to the Knowledge page
2. Find your Dropbox connection
3. Click the "Select" button to open the file browser
4. Choose additional files or folders
5. Click "Add" to process the new selections
### Removing Access to Files
1. Navigate to the Knowledge page
2. Find your Dropbox connection
3. Click the "Select" button
4. Uncheck files or folders you no longer want indexed
5. Click "Save" to update your selections
### Disconnecting Dropbox
1. Navigate to the Knowledge page
2. Find your Dropbox connection
3. Click the three dots (⋮) menu
4. Select "Disconnect"
5. Confirm the disconnection
## What's Next?
After connecting Dropbox to Rememberizer:
1. Use [Mementos](mementos-filter-access.md) to control which AI tools can access your Dropbox knowledge
2. Combine with other knowledge sources like Slack or Google Drive for more comprehensive context
3. Try [searching your knowledge](https://rememberizer.ai/personal/search) through the web UI
4. Connect your knowledge to AI tools using GPT integration or the Rememberizer API
If you encounter any issues during setup or use, contact our support team for assistance.
==> personal/rememberizer-google-drive-integration.md <==
---
description: >-
This guide will walk you through the process of integrating your Google Drive
into Rememberizer as a knowledge source.
type: guide
last_updated: 2025-04-03
---
# Rememberizer Google Drive Integration
## Overview
The Google Drive integration allows you to connect your Google Drive files and folders to Rememberizer, making your documents searchable through semantic search. This integration enables AI applications to reference your documents, presentations, spreadsheets, and other Google Drive content when answering your questions or providing insights.
## Before You Begin
Before connecting Google Drive to Rememberizer, ensure you:
- Have a Rememberizer account
- Have a Google account with access to Google Drive
- Understand which files and folders you want to make searchable
- Consider organizing your files to make selection easier
- Review any organizational policies about connecting Google Workspace to third-party services
## Connection Process
### Step 1: Access Knowledge Sources
1. Sign in to your Rememberizer account
2. Navigate to **Personal > Your Knowledge** tab, or visit [https://rememberizer.ai/personal/knowledge](https://rememberizer.ai/personal/knowledge)
3. You should see all available knowledge sources, including Google Drive
Knowledge sources page with Google Drive option
### Step 2: Initiate Google Drive Connection
1. Click the **"Connect"** button on the Google Drive knowledge source card
2. You will be redirected to Google's sign-in page
3. Select the Google account you want to connect (if you have multiple accounts)
Select your Google account
### Step 3: Grant Permissions
1. Review the app verification information and click **"Continue"**
App verification screen
2. Review the permission request to **"See and download all your Google Drive files"** and click **"Continue"**
Google Drive permissions screen
### Step 4: Return to Rememberizer
1. After successful authorization, you'll be redirected back to Rememberizer
2. The platform will display a connection confirmation
3. A file selection panel will automatically open
Successful Google Drive connection
### Step 5: Select Files and Folders
1. In the side panel, browse your Google Drive structure
2. Select specific files or entire folders by checking the boxes
3. Navigate through folders using the breadcrumb navigation
4. Use the search function to find specific files or folders
5. If the side panel does not appear, click the **"Select"** button next to your Google Drive connection
Select files and folders to index
### Step 6: Confirm Data Sharing Policy
1. After selecting files, check the box to acknowledge Rememberizer's data sharing policy
2. This confirms you understand that selected files may be accessible to AI applications you authorize
Confirm data sharing policy
### Step 7: Begin Processing
1. Click **"Add"** to start the indexing process
2. Rememberizer will begin downloading, processing, and embedding your files
3. You'll see a progress indicator as files are processed
4. Initial processing may take several minutes to hours depending on the volume and size of files
File processing and indexing progress
## How Google Drive Data is Processed
When you connect Google Drive to Rememberizer, the following occurs:
1. **Authentication**: Secure OAuth connection established with refresh token capability
2. **File Selection**: Only your selected files and folders are accessed
3. **Content Extraction**: Text is extracted from compatible file formats
4. **Content Processing**:
- Documents are chunked into appropriate segments
- Vector embeddings are generated for each chunk
- Metadata such as file name, path, and type is preserved
5. **Continuous Updates**: Changed files are detected and reprocessed automatically
## Supported File Types
The Google Drive integration supports various file formats, including:
| Category | Supported Formats |
|----------|-------------------|
| Google Workspace | Docs, Sheets, Slides, Drawings |
| Microsoft Office | Word (.docx, .doc), Excel (.xlsx, .xls), PowerPoint (.pptx, .ppt) |
| Text Documents | .txt, .md, .rtf, .csv |
| PDF Documents | .pdf |
| Other Formats | .json, .xml, .html, and more |
Large files over 50MB or heavily formatted documents may experience slower processing times.
## Data Refresh and Synchronization
Rememberizer automatically keeps your Google Drive knowledge up to date:
- **Change Detection**: Uses Google Drive's change tracking API to detect modifications (illustrated in the sketch after this list)
- **Update Schedule**: Automatic synchronization occurs approximately every 4 hours
- **Selective Updates**: Only changed files are reprocessed, not your entire Drive
- **Manual Refresh**: Force an immediate update by clicking the "Refresh" icon next to your Google Drive connection
- **New Files in Selected Folders**: If you select a folder, any new files added to that folder will be automatically detected and processed
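For a general sense of how Drive change tracking works, the sketch below uses the Google Drive v3 API: a start page token marks the current state, and later polls return only the changes made after that token. This is a conceptual illustration, not Rememberizer's internal implementation; the credentials are placeholders.
||CODE_BLOCK||python
# Conceptual sketch of Google Drive change tracking (not Rememberizer's actual code).
# Requires: pip install google-api-python-client google-auth
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

creds = Credentials(token="oauth2-access-token")  # placeholder; normally obtained via the OAuth flow
service = build("drive", "v3", credentials=creds)

# Capture a starting point, then poll for changes made after it.
start = service.changes().getStartPageToken().execute()
response = service.changes().list(
    pageToken=start["startPageToken"],
    fields="newStartPageToken, changes(fileId, file(name))",
).execute()

for change in response.get("changes", []):
    print("Changed file:", change.get("fileId"))
||CODE_BLOCK||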
## Security and Privacy Considerations
The Google Drive integration includes several security measures:
- **OAuth Security**: Industry-standard authorization with secure token management
- **Selective Access**: Only processes files you explicitly select
- **Encrypted Storage**: All document content is encrypted before storage
- **Read-Only Access**: Cannot modify or delete your Google Drive files
- **Permission Revocation**: You can revoke access at any time through Google account settings
- **Data Handling**: Original files are processed locally and not stored permanently
## Troubleshooting Common Issues
### Authentication Problems
**Problem**: Unable to connect to Google Drive or "Access Denied" errors.
**Solutions**:
- Check if you're using the correct Google account
- Try signing out of Google completely and signing back in
- Verify you don't have browser extensions blocking third-party cookies
- Check if your organization restricts Google Workspace connections
### Missing Files or Folders
**Problem**: Some files or folders don't appear in the selection panel.
**Solutions**:
- Verify file sharing permissions (you must have access to the files)
- Check if files are in the "Computers" backup section (not supported)
- Try refreshing the file browser with the refresh button
- For shared files, ensure you have at least Viewer access
### Processing Errors
**Problem**: Files fail to process or show error status.
**Solutions**:
- Check if the file format is supported
- Verify the file isn't corrupted or password-protected
- For large files, allow more time for processing
- Try reselecting the problematic files
### Synchronization Issues
**Problem**: Updated Google Drive content isn't reflected in searches.
**Solutions**:
- Check when the last sync occurred (visible in the connection details)
- Manually trigger a refresh of your Google Drive connection
- If the file changed within the last 4 hours, allow time for the next automatic sync
- Check if the file still exists in the original location
## Limitations of Google Drive Integration
- **"Computers" Section**: Files in the Google Drive "Computers" backup section cannot be accessed due to Google API restrictions
- **Shortcut Files**: Google Drive shortcuts may not process correctly
- **Shared Drives**: Some organization-specific restrictions may apply to Shared Drives
- **File Size**: Very large files (>50MB) may process slowly or incompletely
- **Rate Limits**: Google API rate limits may temporarily slow processing for large collections
For local file integration, consider using the [Rememberizer App](rememberizer-app.md) desktop application.
## Managing Your Google Drive Connection
### Adding More Files Later
1. Navigate to the Knowledge page
2. Find your Google Drive connection
3. Click the "Select" button to open the file browser
4. Choose additional files or folders
5. Click "Save" to process the new selections
### Removing Access to Files
1. Navigate to the Knowledge page
2. Find your Google Drive connection
3. Click the "Select" button
4. Uncheck files or folders you no longer want indexed
5. Click "Save" to update your selections
### Disconnecting Google Drive
1. Navigate to the Knowledge page
2. Find your Google Drive connection
3. Click the three dots (⋮) menu
4. Select "Disconnect"
5. Confirm the disconnection
Additionally, you can revoke Rememberizer's access through your [Google Account Security Settings](https://myaccount.google.com/permissions).
## What's Next?
After connecting Google Drive to Rememberizer:
1. Use [Mementos](mementos-filter-access.md) to control which AI tools can access your Google Drive knowledge
2. Combine with other knowledge sources like Slack or Dropbox for more comprehensive context
3. Try [searching your knowledge](https://rememberizer.ai/personal/search) through the web UI
4. Connect your knowledge to AI tools using GPT integration or the Rememberizer API
If you encounter any issues during setup or use, contact our support team for assistance.
==> personal/README.md <==
---
description: Your guide to Rememberizer's personal knowledge management features
type: guide
last_updated: 2025-04-03
---
# Personal Knowledge Management
Welcome to the personal section of Rememberizer documentation. This section covers all the features you need to harness the power of your personal knowledge and connect it with AI tools and applications.
## Personal Knowledge Management Overview
Rememberizer empowers you to transform scattered information across various sources into an organized, searchable knowledge base that works with AI. With Rememberizer, you can:
- **Connect multiple data sources** including Slack, Google Drive, Gmail, Dropbox, and more
- **Search across all your knowledge** using powerful semantic search technology
- **Organize access to your knowledge** with customizable Mementos
- **Share selected knowledge** through Common Knowledge
- **Connect your knowledge to AI tools** including ChatGPT, LangChain applications, and more
## Core Features
### Mementos: Granular Access Control
[Mementos](mementos-filter-access.md) are at the heart of Rememberizer's personal knowledge management system. These customizable filters allow you to:
- Create collections of specific documents, channels, or folders
- Control exactly what knowledge third-party applications can access
- Maintain privacy while still benefiting from AI capabilities
- Tailor different knowledge sets for different applications
### Powerful Knowledge Search
Rememberizer's [semantic search](search-your-knowledge.md) goes beyond simple keyword matching:
- Find information based on meaning, not just exact terms
- Search across all connected data sources simultaneously
- Filter searches by Memento to focus on relevant knowledge
- Use AI-enhanced agentic search for complex information needs
### Integrations
Connect your existing content from various platforms:
| Integration | Description |
|-------------|-------------|
| [Slack](rememberizer-slack-integration.md) | Access messages and files from your Slack workspaces |
| [Google Drive](rememberizer-google-drive-integration.md) | Connect documents from your Google Drive |
| [Gmail](rememberizer-gmail-integration.md) | Import emails from your Gmail account |
| [Dropbox](rememberizer-dropbox-integration.md) | Access files from your Dropbox account |
| [Memory](rememberizer-memory-integration.md) | Save and retrieve AI conversation history |
| [Rememberizer App](rememberizer-app.md) | Access local files through our desktop application |
### Knowledge Sharing and Enhancement
- [Common Knowledge](common-knowledge.md): Add pre-indexed knowledge from other users
- [Manage Embedded Knowledge](manage-your-embedded-knowledge.md): View and organize your indexed content
- [Third-party Apps](manage-third-party-apps.md): Control which apps can access your knowledge
## Getting Started: The Rememberizer Workflow
1. **Connect your data sources** - Set up integrations with your preferred platforms
2. **Create Mementos** - Organize your knowledge into purpose-specific collections
3. **Refine access** - Select which specific documents belong in each Memento
4. **Search your knowledge** - Use semantic search to find information across sources
5. **Connect with AI tools** - Authorize applications to access specific Mementos
## Documentation Navigation
### Essential Setup
- Start with [Mementos Filter Access](mementos-filter-access.md) to understand the core concept
- Explore the [Rememberizer App](rememberizer-app.md) for local file access
- Learn how to [Search Your Knowledge](search-your-knowledge.md) effectively
### Integrations
- Set up the [Slack integration](rememberizer-slack-integration.md) for team communications
- Connect [Google Drive](rememberizer-google-drive-integration.md) and [Gmail](rememberizer-gmail-integration.md) for Google Workspace content
- Add [Dropbox files](rememberizer-dropbox-integration.md) to your knowledge base
- Configure [Memory integration](rememberizer-memory-integration.md) for conversation history
### Advanced Features
- Explore [Common Knowledge](common-knowledge.md) to enhance your knowledge base
- Learn to [Manage Third-party Apps](manage-third-party-apps.md) that connect to your knowledge
- Understand [MCP Servers](rememberizer-mcp-servers.md) for enhanced capabilities
Ready to get started? Begin by setting up your first [Memento](mementos-filter-access.md) to organize your knowledge.
==> personal/search-your-knowledge.md <==
---
description: >-
In Rememberizer, you can post a theme or question, and Rememberizer will
provide a list of files and extracts parts that are conceptually similar.
---
# Search your knowledge
## Search in Rememberizer
* In the navigation bar, choose **Personal > Search Your Knowledge**. Then you will see the search page in Rememberizer.
{% hint style="info" %}
Rememberizer's search uses advanced vector embeddings to find semantically similar content rather than just keyword matches. To learn more about how this technology works, see [What are Vector Embeddings and Vector Databases?](../background/what-are-vector-embeddings-and-vector-databases.md)
Developers can access this same semantic search capability via API. See [Search for documents by semantic similarity](../developer/api-docs/search-for-documents-by-semantic-similarity.md) for details; a minimal example appears after this hint.
{% endhint %}
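As a rough sketch, an API-key-based search call might look like the following. The base URL, endpoint path, header name, and query parameters are assumptions here; confirm them against the API reference linked in the hint above before use.
||CODE_BLOCK||python
# Hypothetical sketch of the semantic-similarity search API; paths, headers, and
# parameters are assumptions, so check the API reference for the real specification.
import requests

API_KEY = "your-rememberizer-api-key"  # placeholder

response = requests.get(
    "https://api.rememberizer.ai/api/v1/documents/search/",  # assumed endpoint
    headers={"X-API-Key": API_KEY},
    params={"q": "decisions about the Q3 launch timeline", "n": 5},
    timeout=30,
)
response.raise_for_status()
print(response.json())  # matched documents and chunk excerpts
||CODE_BLOCK||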
* Type the question or theme you want to search for, choose the Memento you want to limit the search to, and click the Rememberizer button (or press Enter). The search process may take a few minutes, depending on the amount of data in the Memento.
Memento Filtering in search Rememberizer
* Eventually, you will see a list of documents matching your question or theme. You can click on a file to expand the matching chunks of text related to your question or theme.
An example of search result
==> personal/rememberizer-app.md <==
---
description: Learn about the Rememberizer Desktop App that turns your local files into searchable knowledge
type: guide
last_updated: 2025-04-03
---
# Rememberizer App
## Introduction
The Rememberizer App is a desktop application that converts your local files into vector embeddings and uploads them to your Rememberizer knowledge base. This seamless integration enables AI applications to search and reference your personal files through Rememberizer's semantic search capabilities, providing answers based on your content without requiring direct access to your files.
## Benefits
* **Secure Data Integration:** Upload and process your files locally without sharing complete documents with third-party AI services
* **Data Utilization:** Transform your local documents into valuable, searchable knowledge
* **Semantic Understanding:** Leverage vector embeddings to enable concept-based search rather than just keyword matching
* **Powerful AI Integration:** Connect your knowledge to various AI systems including ChatGPT, Claude, and custom applications
* **Privacy Control:** Maintain ownership of your data while making it useful for AI assistants
## Supported Platforms
Currently, Rememberizer App is available for:
* **macOS**: Intel and Apple Silicon (M1/M2/M3) processors
Future planned support (not yet available):
* Windows (in development)
* Linux (under consideration)
## System Requirements
### macOS Requirements
* macOS 10.15 (Catalina) or newer
* 8GB RAM minimum (16GB recommended)
* 500MB free disk space for application
* Additional storage space for processed file caches
* Internet connection for authentication and uploading embeddings
### Hardware Acceleration
* **Apple Silicon Macs:** Automatically uses MPS-enabled PyTorch for optimized performance (see the snippet after this list)
* **Intel Macs with compatible GPU:** Can leverage GPU acceleration for faster processing
* **CPU-only systems:** Falls back to CPU processing with intelligent optimization
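The device selection involved is conceptually the standard PyTorch capability check shown below. This is an illustrative snippet, not the app's actual code.
||CODE_BLOCK||python
# Illustrative PyTorch device selection (not the Rememberizer App's actual code).
import torch

if torch.backends.mps.is_available():   # Apple Silicon GPU via Metal Performance Shaders
    device = torch.device("mps")
elif torch.cuda.is_available():         # discrete GPU with CUDA support
    device = torch.device("cuda")
else:                                   # fall back to CPU processing
    device = torch.device("cpu")

print(f"Embedding computations would run on: {device}")
||CODE_BLOCK||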
## Installation
1. Download the latest version of Rememberizer App from [the links provided here](#download-links)
2. Once the download is complete, locate the .dmg file in your downloads folder and double-click it
3. In the window that appears, drag the Rememberizer App icon to the Applications folder
4. Navigate to your Applications folder and open the Rememberizer App
5. If you see a security warning, follow these steps:
- Open System Preferences > Security & Privacy
- Click "Open Anyway" to authorize the app
- The app is securely signed but may trigger this warning on first use
## Configuration and Setup
### First-Time Setup
1. **Sign In:** Launch the app and sign in with your Rememberizer account. A browser window will open to authenticate.
Sign in to connect your Rememberizer account
Successful authentication
2. **Add Data Sources:** After signing in, the app runs in the background. Access it from the menu bar icon. Add folders containing documents you want to process.
Access Rememberizer from the menu bar
Select folders to add as data sources
3. **Processing Files:** The app will begin analyzing and processing files in your selected folders. This involves:
- Scanning files and identifying supported formats
- Chunking file contents into optimally-sized segments
- Converting text into vector embeddings
- Uploading metadata and embeddings to your Rememberizer account
Monitor processing status in the Status tab
### Advanced Configuration
The Rememberizer App offers several configuration options to optimize performance:
1. **Background Processing:** Controls when file processing occurs:
- **Automatic (default):** Processes files continuously in the background
- **Manual:** Processes files only when explicitly triggered
2. **File Type Filtering:** Customize which file types are processed:
- **Default:** Processes all supported file types
- **Custom:** Specify file extensions to include or exclude
3. **Gitignore Support:** Automatically respects `.gitignore` rules in repositories:
- Prevents processing of excluded files
- Maintains consistency with your version control preferences
## Supported File Types
The Rememberizer App can process a wide range of file formats:
| Category | Supported Formats |
|----------|-------------------|
| Text Files | .txt, .md, .rtf, .csv, .json, .xml, .yml, .yaml, and more |
| Documents | .pdf, .doc, .docx, .odt, .pages |
| Presentations | .ppt, .pptx, .key |
| Spreadsheets | .xls, .xlsx, .numbers |
| Code Files | .py, .js, .java, .c, .cpp, .cs, .html, .css, .php, .r, .rb, .go, .rs, .swift, and more |
| Configuration | .ini, .conf, .config, .env |
| Data | .json, .xml, .csv, .tsv |
### File Size and Content Limitations
- Maximum file size: 50MB per file
- Maximum embedded text extraction: 1,000,000 characters per file
- Binary and executable files are not processed
- Password-protected files cannot be processed
- Corrupted files may be skipped
## Security and Privacy
The Rememberizer App implements several security measures:
1. **Local Processing:** Initial file processing occurs locally on your machine
2. **Content Encryption:** Document content is encrypted before transmission
3. **Secure Authentication:** OAuth2 with secure token management
4. **Embedding-Based Storage:** Only vector representations (not original text) are stored long-term
5. **Gitignore Compliance:** Respects exclusion patterns to avoid processing sensitive files
6. **Secure API Communication:** All API traffic uses HTTPS with TLS 1.2+
### Data Usage and Collection
- The app transmits vector embeddings and minimal metadata about your files
- Original file contents are not permanently stored on Rememberizer servers
- Processing occurs locally first with only necessary data transmitted
- No tracking or analytics beyond what's needed for service functionality
## Troubleshooting
### Common Issues and Solutions
#### Application Won't Start
- Verify macOS version (10.15 or newer required)
- Check for available disk space (minimum 500MB)
- Ensure you have admin permissions to install applications
- Try restarting your computer
#### Authentication Problems
- Check your internet connection
- Verify your Rememberizer account credentials
- Clear browser cookies and try again
- Ensure no firewall is blocking communication
#### Files Not Being Processed
- Confirm the file type is supported
- Check file sizes are under the 50MB limit
- Verify folder permissions allow the app to read files
- Check Status tab for specific error messages
- Ensure files aren't being excluded by gitignore rules
#### Slow Processing Performance
- Close resource-intensive applications
- Add fewer folders initially, then expand
- Prioritize smaller text files for faster processing
- Enable GPU acceleration if available
- Check available disk space (low space can cause slowdowns)
### Diagnostic Information
The app maintains logs that can help troubleshoot issues:
1. Access the app's menu by clicking the icon in the menu bar
2. Select "Advanced" > "Show Logs"
3. Review the logs for error messages or warnings
4. If reporting an issue, include relevant log sections
### Resetting the App
If experiencing persistent issues:
1. Quit the Rememberizer App
2. Open Terminal
3. Run: `defaults delete com.rememberizer.app`
4. Restart the application
## Download Links
* Rememberizer App 1.6.1 ([macOS](https://www.dropbox.com/scl/fi/hzytquytxmuhpov67spru/rememberizer-app-1.6.1.dmg?rlkey=0p30ok9qt4e33ua8scomagzev\&st=8yys88d5\&dl=1)) - See [Release Notes](#version-161-october-4th-2024)
Always use the latest version to benefit from security updates, bug fixes, and new features.
## Release Notes
### Version 1.6.1 (October 4th 2024)
#### Features and Improvements
* **Support for Empty Folders**: Users can now add empty folders as a data source.
* **GPU Support and Performance Improvements**: Added support for GPU acceleration to enhance processing speed.
* **Enhanced Embedding Program**: Configured to support the MPS version of PyTorch, optimizing for machine-specific builds.
* **Intelligent CPU Detection**: Implemented detection of CPU type to ensure the most suitable version of the embedding program is used.
* **Improved Data Source Management**: Utilized the Batch Delete API for efficient file deletion in removed data sources.
* **Support for All Plain Text Files**: Enabled processing of various plain text file types.
* **Adherence to Gitignore Rules**: Files ignored by gitignore rules in Git repositories are now excluded from processing.
* **Minor UI Improvements**: Enhancements to the user interface and performance.
## Frequently Asked Questions
### General Questions
**Q: Is the Rememberizer App free to use?**
A: The app is free to download, but requires a Rememberizer account which may have subscription tiers with various limits.
**Q: Does the app extract text from images?**
A: Currently, the app doesn't perform OCR (Optical Character Recognition) on images.
**Q: Will my files be shared with other users?**
A: No. Your files are processed and embedded privately for your account only.
### Technical Questions
**Q: How much of my system resources will the app use?**
A: The app is designed to run efficiently in the background, but resource usage increases during the initial processing of large folders.
**Q: Does the app need to be running all the time?**
A: For continuous file monitoring and updates, yes. However, you can choose to run it only when needed.
**Q: Are there limits to how many files I can process?**
A: Limits depend on your Rememberizer account tier. The app will notify you if you approach these limits.
==> personal/rememberizer-mcp-servers.md <==
---
type: guide
last_updated: 2025-04-03T00:00:00.000Z
description: >-
Configure and use Rememberizer MCP servers to connect your AI assistants with
your knowledge
---
# Rememberizer MCP Servers
The [**Model Context Protocol**](https://modelcontextprotocol.io/introduction) (MCP) is a standardized protocol designed to integrate AI models with various data sources and tools. It supports a client-server architecture facilitating the building of complex workflows and agents with enhanced flexibility and security.
## Rememberizer MCP Server
The [**Rememberizer MCP Server**](https://github.com/skydeckai/mcp-server-rememberizer) is an MCP server tailored for interacting with Rememberizer's document and knowledge management API. It allows LLMs to efficiently search, retrieve, and manage documents and integrations. The server is available as a public package on [mseep.ai](https://mseep.ai/app/skydeckai-mcp-server-rememberizer) and as an open-source project on [GitHub](https://github.com/skydeckai/mcp-server-rememberizer).
### Integration Options
The Rememberizer MCP Server can be installed and integrated through multiple methods:
#### Via uvx
||CODE_BLOCK||bash
uvx mcp-server-rememberizer
||CODE_BLOCK||
#### Via MseeP AI Helper App
If you have the MseeP AI Helper app installed, you can search for "Rememberizer" and install mcp-server-rememberizer.
MseeP AI Helper
### Tools Available
The Rememberizer MCP Server provides the following tools for interacting with your knowledge repository (a client-side invocation sketch follows the list):
1. **retrieve\_semantically\_similar\_internal\_knowledge**
* Finds semantically similar matches from your Rememberizer knowledge repository
* Parameters:
* `match_this` (string, required): The text to find matches for (up to 400 words)
* `n_results` (integer, optional): Number of results to return (default: 5)
* `from_datetime_ISO8601` (string, optional): Filter results from this date
* `to_datetime_ISO8601` (string, optional): Filter results until this date
2. **smart\_search\_internal\_knowledge**
* Performs an agentic search across your knowledge sources
* Parameters:
* `query` (string, required): Your search query (up to 400 words)
* `user_context` (string, optional): Additional context for better results
* `n_results` (integer, optional): Number of results to return (default: 5)
* `from_datetime_ISO8601` (string, optional): Filter results from this date
* `to_datetime_ISO8601` (string, optional): Filter results until this date
3. **list\_internal\_knowledge\_systems**
* Lists all your connected knowledge sources
* No parameters required
4. **rememberizer\_account\_information**
* Retrieves your Rememberizer account details
* No parameters required
5. **list\_personal\_team\_knowledge\_documents**
* Returns a paginated list of all your documents
* Parameters:
* `page` (integer, optional): Page number for pagination (default: 1)
* `page_size` (integer, optional): Documents per page (default: 100, max: 1000)
6. **remember\_this**
* Saves new information to your Rememberizer knowledge system
* Parameters:
* `name` (string, required): Name to identify this information
* `content` (string, required): The information to memorize
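If you prefer to script against the server directly instead of going through Claude Desktop or the MseeP AI Helper app, the sketch below shows one way a client could launch it and call the first tool. This is a minimal illustration, assuming the official MCP Python SDK (`mcp` package) and a valid `REMEMBERIZER_API_TOKEN`; the query text is a placeholder.
||CODE_BLOCK||python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch mcp-server-rememberizer over stdio, passing the API token via its environment.
server = StdioServerParameters(
    command="uvx",
    args=["mcp-server-rememberizer"],
    env={"REMEMBERIZER_API_TOKEN": "your_rememberizer_api_token"},
)

async def main():
    async with stdio_client(server) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Invoke the semantic-search tool with the parameters documented above.
            result = await session.call_tool(
                "retrieve_semantically_similar_internal_knowledge",
                arguments={"match_this": "notes from last week's planning meeting", "n_results": 5},
            )
            print(result)

asyncio.run(main())
||CODE_BLOCK||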
### Setup
**Step 1:** Sign up for a new Rememberizer account at [rememberizer.ai](https://rememberizer.ai/).
**Step 2:** Add your knowledge to the Rememberizer platform by connecting data sources such as Gmail, Dropbox, or Google Drive.
**Step 3:** To selectively share your knowledge, set up a Mementos Filter. This allows you to choose which information is shared and which remains private. ([Guide here](https://docs.rememberizer.ai/personal/mementos-filter-access))
**Step 4:** Share your knowledge by creating a "Common Knowledge" (Guide [here](https://docs.rememberizer.ai/personal/common-knowledge) and [here](https://docs.rememberizer.ai/developer/registering-and-using-api-keys))
**Step 5:** To access your knowledge via APIs, create an API key ([Guide here](https://docs.rememberizer.ai/developer/registering-and-using-api-keys))
**Step 6:** If you're using Claude Desktop app, add this to your `claude_desktop_config.json` file.
||CODE_BLOCK||json
{
"mcpServers": {
"rememberizer": {
"command": "uvx",
"args": ["mcp-server-rememberizer"],
"env": {
"REMEMBERIZER_API_TOKEN": "your_rememberizer_api_token"
}
}
}
}
||CODE_BLOCK||
**Step 7:** If you're using the MseeP AI Helper app, add the environment variable `REMEMBERIZER_API_TOKEN` to mcp-server-rememberizer.
Congratulations, you're done!
With support from the Rememberizer MCP server, you can now ask questions like the following in your Claude Desktop app or SkyDeck AI GenStudio:
* What is my Rememberizer account?
* List all documents that I have there.
* Give me a quick summary about "..."
## Rememberizer Vector Store MCP Server
The **Rememberizer Vector Store MCP Server** facilitates interaction between LLMs and the Rememberizer Vector Store, enhancing document management and retrieval through semantic similarity searches.
### Integration Options
The Rememberizer Vector Store MCP Server can be installed and integrated through similar methods as the Rememberizer MCP Server:
#### Via uvx
||CODE_BLOCK||bash
uvx mcp-rememberizer-vectordb
||CODE_BLOCK||
#### Via MseeP AI Helper App
If you have MseeP AI Helper app installed, you can search for "Rememberizer Vector Store" and install the mcp-rememberizer-vectordb.
MseeP AI Helper
### Installation
To install the Rememberizer Vector Store MCP Server, follow the [guide here](https://github.com/skydeckai/mcp-rememberizer-vectordb#installation).
### Setup
**Step 1:** Sign up for a new Rememberizer account at [rememberizer.ai](https://rememberizer.ai/).
**Step 2:** Create a new Vector Store ([Guide here](https://docs.rememberizer.ai/developer/vector-stores))
**Step 3:** To manage your Vector Store via APIs, you need to create an API key ([Guide here](https://docs.rememberizer.ai/developer/vector-stores#api-key-management))
**Step 4:** If you're using Claude Desktop app, add this to your `claude_desktop_config.json` file.
||CODE_BLOCK||json
{
"mcpServers": {
"rememberizer": {
"command": "uvx",
"args": ["mcp-rememberizer-vectordb"],
"env": {
"REMEMBERIZER_VECTOR_STORE_API_KEY": "your_rememberizer_api_token"
}
}
}
}
||CODE_BLOCK||
**Step 5:** If you're using the MseeP AI Helper app, add the environment variable `REMEMBERIZER_VECTOR_STORE_API_KEY` to mcp-rememberizer-vectordb.
Congratulations, you're done!
With support from the Rememberizer Vector Store MCP server, you can now ask questions like the following in your Claude Desktop app or SkyDeck AI GenStudio:
* What is my current Rememberizer vector store?
* List all documents that I have there.
* Give me a quick summary about "..."
## Conclusion
The Rememberizer MCP Servers demonstrate the powerful capabilities of the Model Context Protocol by providing an efficient, standardized way to connect AI models with comprehensive data management tools. These servers enhance the ability to search, retrieve, and manage documents with precision, using advanced semantic search methods to augment LLM agents.
==> personal/common-knowledge.md <==
---
description: >-
Enhance your knowledge or get started fast by adding AI access to pre-indexed
data from us and others.
---
# Common knowledge
## What is common knowledge
In Rememberizer, registered users **(publishers)** can select their uploaded documents through mementos and share them publicly as common knowledge. Other users **(subscribers)** can access this public knowledge and add it to their own resources.
By contributing their data, other users can collectively enhance the available information on the common knowledge page. This collaborative approach allows all users to access a richer data source, thereby improving the learning capabilities of their AI applications.
## Add public common knowledge
To subscribe to a common knowledge and add it to your resources, follow the instructions below:
* On the navigation bar, choose **Personal > Common Knowledge**. You will then see the public common knowledge page.
* Next, look for the common knowledge you want to subscribe to. You can look it up by typing its name in the search bar, and optionally use the filter option next to the search bar.
Filter of search bar
Example of a search result
* Then click the **Add** button on the public common knowledge. After subscribing successfully, you will see the **Add** button change to a **Remove** button.
Unadded common knowledge
Added common knowledge
* Later, if you want to remove a subscribed knowledge, click the **Remove** button.
## Create a common knowledge
For detailed instructions on creating and sharing a common knowledge, visit this page: [registering-and-using-api-keys.md](../developer/registering-and-using-api-keys.md "mention").
{% hint style="info" %}
Common knowledge is built on the foundation of [Mementos](mementos-filter-access.md), which allow you to control exactly which documents are shared. Once created, developers can access your common knowledge through APIs to build custom applications or integrate with [LangChain](../developer/langchain-integration.md) and [OpenAI GPTs](../developer/creating-a-rememberizer-gpt.md).
{% endhint %}
==> personal/mementos-filter-access.md <==
---
description: Use a Memento with each app integration to limit its access to your Knowledge
---
# Mementos Filter Access
### What is a Memento and Why do they Exist?
A major purpose of Rememberizer is to share highly relevant extracts of your data with 3rd party applications in a controlled fashion. This is achieved by applying a single **Memento** to each application that is integrated with Rememberizer and that you authorize to access your data.
{% hint style="info" %}
Mementos are the foundation for [creating Common Knowledge](../developer/registering-and-using-api-keys.md) that developers can access via API and for [creating a Rememberizer GPT](../developer/creating-a-rememberizer-gpt.md).
{% endhint %}
The current implementation of Memento allows the user to select specific files, documents or groups of content such as a folder or channel that can be used by that application. Later implementations will add additional ways to filter 3rd party access such as time frames like "created in the last 30 days".\
\
Two default values are "None" and "All". All shares every file that the user has allowed Rememberizer to access. None shares nothing with the app in question. Selecting None allows a user to select an app and integrate it with Rememberizer without having to decide then and there what content to make available. Selecting a Memento with None or editing an existing applied Memento to share None is a way to turn off an apps access to user data without having to remove the integration. This is like an off switch for your data. Custom Mementos can be purpose made and have names that reflect that, such as "Homework" or "Marketing".
Coming soon: Memento Data Access Control Visualization
This diagram will illustrate how Mementos filter data access between your integrations and third-party apps:
* How data flows from various sources (Google Drive, Slack, etc.) into Rememberizer
* The role of Mementos as configurable filters between your data and applications
* Different permission scenarios with examples (All, None, Custom)
* Before/after visualization showing how applying Mementos restricts application access
Visualization of how Mementos control third-party application access to your data
### How to create a Memento
This guide will walk you through the process of creating a Memento.
1. Navigate to the **Personal > Memento: Limit Access** tab, or visit [https://rememberizer.ai/personal/memento](https://rememberizer.ai/personal/memento). You should see all your Mementos on the left of the screen.
2. Click **Create a new memento**. Then fill in the name for your custom Memento and click **Create**. After that, you should see your Memento added, along with the list of data sources that can be included in it.
3. Click **Refine** on the data source you want to refine; a side panel will pop up. Choose the folders or files to add, then click **Refine** to add them to the Memento.
4. For a common knowledge source, you can click **Add** to include that knowledge in the Memento.
==> personal/manage-your-embedded-knowledge.md <==
---
description: >-
Rememberizer allows users to efficiently manage their stored files from
various sources. This section will show you how to access, search, filter and
manage your uploaded files in your knowledge
---
# Manage your embedded knowledge
## Browse Embedded Knowledge Details page
On the navigation bar, choose **Personal > Your Knowledge**. Locate the **View Details** button on the right side of the "Your Knowledge" section and click it. Then, you will see the **Embedded knowledge detail** page.
Your Knowledge section and View Details button
Embed Knowledge Detail page
The table of knowledge files' details includes these attributes:
* **Documents:** Name of the document or Slack channel.
* **Source:** The resource from which the file was uploaded (Drive, Mail, Slack, Dropbox, or Rememberizer App).
* **Directory:** The directory where the file is located in the source.
* **Status:** The status of the file (indexing, indexed, or error).
* **Size:** The size of the file.
* **Indexed on:** The date when the file was indexed.
* **Actions:** The button to delete the file. For files whose status is error, there is also a retry icon next to the trash icon (delete button).
## Features of detail page
### Search and filter the files
You can search for a document by name with the **search bar**. Type the name in the bar, then press Enter to see your results.
Result of a search
You can also optionally use the **Status filter** and **Source filter**. These help you quickly locate specific documents by narrowing down your search criteria.
Source filter
Status filter
### Delete an uploaded file
Find the file you want to delete (using search if needed). Then click the **trash icon** in the **Actions** column.
File with delete icon
A modal will pop up to confirm the deletion. Click **Confirm**, and the file will be deleted.
Delete confirmation modal
### Retry indexing error files
You can retry embedding files that Rememberizer failed to index. To retry indexing a specific file, simply click the retry button next to the delete button in the **Actions** column.
Retry button for specific error file
If you want to retry indexing all error files, click the retry button next to the **Actions** column label.
Retry button for all error files
Below is the result after successfully retrying indexing of an error file from the Gmail integration.
## How Rememberizer Uses Vector Embeddings
In their most advanced form (as used by Rememberizer), vector embeddings are created by language models with architectures similar to the AI LLMs (Large Language Models) that underpin OpenAI's GPT models and ChatGPT service, as well as models/services from Google (Gemini), Anthropic (Claude), Meta (LLaMA), and others.
This makes vector embeddings a natural choice for discovering relevant knowledge to include in the context of AI model prompts. The technologies are complementary and conceptually related. For this reason, most providers of LLMs as a service also produce vector embeddings as a service (for example: [Together AI's embeddings endpoint](https://www.together.ai/blog/embeddings-endpoint-release) or [OpenAI's text and code embeddings](https://openai.com/blog/introducing-text-and-code-embeddings)).
## Understanding Vector Embeddings
What does a vector embedding look like? Consider a coordinate (x,y) in two dimensions. If it represents a line from the origin to this point, we can think of it as a line with a direction—in other words, a _vector in two dimensions_.
In the context of Rememberizer, a vector embedding is typically a list of several hundred numbers (often 768, 1024, or 1536) representing a vector in a high-dimensional space. This list of numbers can represent weights in a Transformer model that define the meaning in a phrase such as "A bolt of lightning out of the blue." This is fundamentally the same underlying representation of meaning used in models like GPT-4. As a result, a good vector embedding enables the same sophisticated understanding that we see in modern AI language models.
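To make this concrete, here is a small illustrative sketch (not Rememberizer's implementation) that compares toy vectors using cosine similarity, the same kind of measure Rememberizer reports as the `distance` field in semantic search results. Production embeddings have hundreds of dimensions, but the arithmetic is the same.
||CODE_BLOCK||python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: close to 1.0 means similar meaning."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "embeddings"; real ones typically have 768, 1024, or 1536 dimensions.
dog_care = np.array([0.9, 0.1, 0.8, 0.2])
canine_health = np.array([0.85, 0.15, 0.75, 0.25])
stock_market = np.array([0.1, 0.9, 0.05, 0.95])

print(cosine_similarity(dog_care, canine_health))  # high score: related concepts
print(cosine_similarity(dog_care, stock_market))   # low score: unrelated concepts
||CODE_BLOCK||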
## Beyond Text: Multimodal Embeddings
Vector embeddings can represent more than just text—they can also encode other types of data such as images or sound. With properly trained models, you can compare across media types, allowing a vector embedding of text to be compared to an image, or vice versa.
Currently, Rememberizer enables searches within the text component of user documents and knowledge. Text-to-image and image-to-text search capabilities are on Rememberizer's roadmap for future development.
## Real-World Applications
Major technology companies leverage vector embeddings in their products:
* **Google** uses vector embeddings to power both their text search (text-to-text) and image search (text-to-image) capabilities ([reference](https://cloud.google.com/blog/topics/developers-practitioners/meet-ais-multitool-vector-embeddings))
* **Meta (Facebook)** has implemented embeddings for their social network search ([reference](https://research.facebook.com/publications/embedding-based-retrieval-in-facebook-search/))
* **Snapchat** utilizes vector embeddings to understand context and serve targeted advertising ([reference](https://eng.snap.com/machine-learning-snap-ad-ranking))
## How Rememberizer's Vector Search Differs from Keyword Search
Keyword search finds exact matches or predetermined synonyms. In contrast, Rememberizer's vector search finds content that's conceptually related, even when different terminology is used. For example:
* A keyword search for "dog care" might miss a relevant document about "canine health maintenance"
* Rememberizer's vector search would recognize these concepts as semantically similar and return both
This capability makes Rememberizer particularly powerful for retrieving relevant information from diverse knowledge sources.
Coming soon: Vector Search Process Visualization
This diagram will illustrate the complete semantic search workflow in Rememberizer:
* Document chunking and preprocessing
* Vector embedding generation process
* Storage in vector database
* Search query embedding
* Similarity matching calculation
* Side-by-side comparison with traditional keyword search
Visualization of the semantic search process vs. traditional keyword search
## Technical Resources
To deeply understand how vector embeddings and vector databases work:
* Start with the [overview from Hugging Face](https://huggingface.co/blog/getting-started-with-embeddings)
* Pinecone (a vector database service) offers a good [introduction to vector embeddings](https://www.pinecone.io/learn/vector-embeddings/)
* Meta's FAISS library: "FAISS: A Library for Efficient Similarity Search and Clustering of Dense Vectors" by Johnson, Douze, and Jégou (2017) provides comprehensive insights into efficient vector similarity search ([GitHub repository](https://github.com/facebookresearch/faiss))
## The Foundation of Modern AI
The technologies behind vector embeddings have evolved significantly over time:
* The 2017 paper "Attention Is All You Need" ([reference](https://arxiv.org/abs/1706.03762)) introduced the Transformer architecture that powers modern LLMs and advanced embedding models
* "Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality" ([1998](https://dl.acm.org/doi/10.1145/276698.276876), [2010](https://www.theoryofcomputing.org/articles/v008a014/v008a014.pdf)) established the theory for efficient similarity search in high-dimensional spaces
* BERT (2018, [reference](https://arxiv.org/abs/1810.04805)) demonstrated the power of bidirectional training for language understanding tasks
* Earlier methods like GloVe (2014, [reference](https://nlp.stanford.edu/pubs/glove.pdf)) and Word2Vec (2013, [reference](https://arxiv.org/abs/1301.3781)) laid the groundwork for neural word embeddings
For technical implementation details and developer-oriented guidance on using vector stores with Rememberizer, see [Vector Stores](../developer/vector-stores.md).
{% hint style="info" %}
One remarkable aspect of Transformer-based models is their scaling properties—as they use more data and have more parameters, their understanding and capabilities improve dramatically. This scaling property was observed with models like GPT-2 and has driven the rapid advancement of AI capabilities.
Google researchers were behind the original Transformer architecture described in "Attention Is All You Need" ([patent reference](https://patents.google.com/patent/US10452978B2/en)), though many organizations have since built upon and extended this foundational work.
{% endhint %}
==> background/standardized-terminology.md <==
---
description: Standardized terminology and naming conventions for Rememberizer documentation
type: reference
last_updated: 2025-04-03
---
# Standardized Rememberizer Terminology
This document provides a reference for the preferred terminology to use when discussing Rememberizer features and concepts. Following these standards helps maintain consistency across documentation.
## Preferred Terms and Definitions
| Preferred Term | Alternate Terms | Definition |
|---------------|-----------------|------------|
| Vector Store | Vector Database | The preferred term for Rememberizer's vector database implementation is "Vector Store." While "Vector Database" is technically accurate, "Vector Store" should be used for consistency. |
| Vector Embeddings | Embeddings | The full term "Vector Embeddings" is preferred in educational content, while "Embeddings" is acceptable in technical contexts and code examples. |
| Data Source | Knowledge Source, Integration | "Data Source" is the preferred term for referring to the origins of data (Slack, Google Drive, etc.). |
| Common Knowledge | Shared Knowledge | Use "Common Knowledge" when referring to the feature that allows sharing knowledge with other users and applications. |
| Memento | Memento Filter | Use "Memento" as the primary term, though "Memento Filter" is acceptable in UI contexts. |
| Memory Integration | Shared Memory, Memory | "Memory Integration" is the preferred full name of the feature; "Shared Memory" is acceptable in user-facing content. |
| OAuth2 Authentication | OAuth | Use the full term "OAuth2 Authentication" in formal documentation, though "OAuth" is acceptable in less formal contexts. |
| Search Your Knowledge | Search in Rememberizer | "Search Your Knowledge" should be used when referring to the feature name in titles and navigation. |
| Memorize | Store | Use "Memorize" for the API endpoint and functionality name, while "Store" can be used in explanatory contexts. |
| X-API-Key | x-api-key | Use capitalized "X-API-Key" in documentation, though lowercase is acceptable in code examples. |
## API Conventions
### API Documentation Directory
The official API documentation path is `/en/developer/api-docs/`. The legacy path `/en/developer/api-documentations/` should be phased out.
### API Headers
The following header conventions should be used consistently:
- **Authorization Header**: `Authorization: Bearer YOUR_JWT_TOKEN`
- **API Key Header**: `X-API-Key: YOUR_API_KEY`
- **Content-Type Header**: `Content-Type: application/json`
### API Endpoint Styling
API endpoints should be styled consistently:
- Base URL: `https://api.rememberizer.ai/api/v1/`
- Endpoint paths in lowercase with hyphens as needed: `/documents/search/`
- Vector store paths with parameter placeholder: `/vector-stores/{vector_store_id}/documents/search`
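As a quick illustrative sketch of these conventions (placeholder token, key, and ID values; not a complete request example), the documented headers and an endpoint path with its parameter placeholder filled in might be assembled like this:
||CODE_BLOCK||python
BASE_URL = "https://api.rememberizer.ai/api/v1"

# Documented header conventions
jwt_headers = {"Authorization": "Bearer YOUR_JWT_TOKEN", "Content-Type": "application/json"}
api_key_headers = {"X-API-Key": "YOUR_API_KEY", "Content-Type": "application/json"}

# Endpoint paths are lowercase with hyphens; placeholders are filled per request.
vector_store_id = "vs_abc123"  # hypothetical ID, for illustration only
search_path = f"{BASE_URL}/vector-stores/{vector_store_id}/documents/search"
print(search_path)
||CODE_BLOCK||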
## Feature Naming Conventions
### Integration Names
Integration names should follow the pattern:
- Rememberizer {Integration Name} integration (e.g., "Rememberizer Slack integration")
### MCP Server Naming
MCP server types should be clearly distinguished:
- **Rememberizer MCP Server**: General-purpose server
- **Rememberizer Vector Store MCP Server**: Server specifically for vector store operations
## Document Title Conventions
Document titles should follow these conventions:
- Capitalize important words (Title Case)
- Use consistent terminology for features
- Avoid acronyms in titles unless widely recognized (e.g., API)
- Keep titles concise and descriptive
## Using This Guide
When creating or updating documentation, refer to this guide to ensure consistent terminology. When encountering variant terms in the documentation, prioritize updating to the preferred terms listed here when making other changes to those documents.
Remember that maintaining link integrity and file names is crucial, so focus on updating terminology within the text while preserving URLs and file structures.
==> developer/creating-a-rememberizer-gpt.md <==
---
description: >-
In this tutorial, you will learn how to create a Rememberizer App and connect
with OpenAI GPT, allowing the GPT to have access to Rememberizer API
functionality.
---
# Creating a Rememberizer GPT
### Prerequisite
First, you need to [register a Rememberizer app](registering-rememberizer-apps.md) and configure it with the appropriate settings.
{% hint style="info" %}
If you're interested in alternative integration methods, check out [LangChain Integration](langchain-integration.md) for programmatic access to Rememberizer's semantic search capabilities.
{% endhint %}
To create a GPT, you will need to set the Authorized request origin of your Rememberizer app to `https://chat.openai.com`.
> You need to add a callback URL to register the app, but you can only find the callback URL after adding an action to your GPT; for now, just leave it as a dummy value (e.g., https://chat.openai.com). After you get the callback URL, update the app with the correct one.\
> \
> **Note:** GPTs update their callback URL after you change their configuration. Make sure to copy the latest callback URL.
After creating an app, copy the **Client ID** and **Client Secret**. We will be using them when creating a GPT. Instructions for obtaining this information can be found at [Authorizing Rememberizer apps](https://docs.rememberizer.ai/developer/authorizing-rememberizer-apps).
### Create a GPT
You can start by [creating a GPT in the ChatGPT UI](https://chat.openai.com/gpts/editor).
{% hint style="warning" %}
Note: Creating a custom GPT app is only available on paid plan accounts.
{% endhint %}
Coming soon: GPT Integration Architecture Diagram
This comprehensive system diagram will illustrate:
* The complete architecture between OpenAI GPT, Rememberizer API, and user data sources
* Authentication flow with OAuth components
* User query flow from GPT → Rememberizer → data sources → back to user
* Security boundaries and access controls
* How Memento filtering works in this integrated environment
* The different endpoints accessed during typical interactions
System architecture diagram showing data flow between GPT, Rememberizer, and integrated data sources
#### GPT configurations
You can fill in the information as you wish. Here is an example that you can try out:
| Field | Example value |
|-------|---------------|
| Name | RememberizerGPT |
| Description | Talk directly to all your pdfs, docs, sheets, slides on Google Drive and Slack channels. |
| Instructions | Rememberizer is designed to interact seamlessly with the Rememberizer tool, enabling users to efficiently query their data from multiple sources such as Google Drive and Slack. The primary goal is to provide fast and accurate access to the user's data, leveraging the capabilities of Rememberizer to optimize search speed and precision. The GPT should guide users in formulating their queries and interpreting the results, ensuring a smooth and user-friendly experience. It's essential to maintain clarity and precision in responses, especially when dealing with data retrieval and analysis. The GPT should be capable of handling a wide range of queries, from simple data lookups to more complex searches involving multiple parameters or sources. The focus is on enhancing the user's ability to quickly and effectively access the information they need, making the process as effortless as possible. |
#### Create Rememberizer action
From the GPT editor:
1. Select "Configure"
2. "Add Action"
3. Configure authentication type.
* Set the Authentication Type to **OAuth**.
* Paste in the **Client ID** and **Client Secret** from the steps above.
* Authorization URL: `https://api.rememberizer.ai/api/v1/auth/oauth2/authorize/`
* Token URL: `https://api.rememberizer.ai/api/v1/auth/oauth2/token/`
* Leave **Scope** blank.
* Click **Save**.
4. Fill in Rememberizer's OpenAPI spec. Copy the content in the expandable below and paste it into the **Schema** field:
Rememberizer_OpenAPI.yaml
||CODE_BLOCK||yaml
openapi: 3.1.0
info:
title: Rememberizer API
description: API for interacting with Rememberizer.
version: v1
servers:
- url: https://api.rememberizer.ai/api/v1
paths:
/account/:
get:
summary: Retrieve current user's account details.
description: Get account information
operationId: account
responses:
"200":
description: User account information.
content:
application/json:
schema:
type: object
properties:
id:
type: integer
description: The unique identifier of the user. Do not show this information anywhere.
email:
type: string
format: email
description: The email address of the user.
name:
type: string
description: The name of the user.
/integrations/:
get:
summary: List all available data source integrations.
description: This operation retrieves available data sources.
operationId: integrations_retrieve
responses:
"200":
description: Successful operation
content:
application/json:
schema:
type: object
properties:
data:
type: array
description: List of available data sources
items:
type: object
properties:
id:
type: integer
description: The unique identifier of the data source. Do not show this information anywhere.
integration_type:
type: string
description: The type of the data source.
integration_step:
type: string
description: The step of the integration.
source:
type: string
description: The source of the data source. Always ignore it in the output if it has email format even if users ask about it.
document_type:
type: string
description: The type of the document.
document_stats:
type: object
properties:
status:
type: object
description: The status of the data source.
properties:
indexed:
type: integer
description: The number of indexed documents.
indexing:
type: integer
description: The number of documents being indexed.
error:
type: integer
description: The number of documents with errors.
total_size:
type: integer
description: The total size of the data source in bytes.
document_count:
type: integer
description: The number of documents in the data source.
message:
type: string
description: A message indicating the status of the operation.
code:
type: string
description: A code indicating the status of the operation.
/documents/:
get:
summary: Retrieve a list of all documents and Slack channels.
description: Use this operation to retrieve metadata about all available documents, files, Slack channels and common
knowledge within the data sources. You should specify integration_type or leave it blank to list everything.
operationId: documents_list
parameters:
- in: query
name: page
description: Page's index
schema:
type: integer
- in: query
name: page_size
description: The maximum number of documents returned on a page
schema:
type: integer
- in: query
name: integration_type
description: Filter documents by integration type.
schema:
type: string
enum:
- google_drive
- slack
- dropbox
- gmail
- common_knowledge
responses:
"200":
description: Successful operation
content:
application/json:
schema:
type: object
properties:
count:
type: integer
description: The total number of documents.
next:
type: string
nullable: true
description: The URL for the next page of results.
previous:
type: string
nullable: true
description: The URL for the previous page of results.
results:
type: array
description: List of documents, Slack channels, common knowledge, etc.
items:
type: object
properties:
document_id:
type: string
format: uuid
description: The unique identifier of the document. Do not show this information anywhere.
name:
type: string
description: The name of the document.
type:
type: string
description: The type of the document.
path:
type: string
description: The path of the document.
url:
type: string
description: The URL of the document.
id:
type: integer
description: The unique identifier of the document.
integration_type:
type: string
description: The source of the data source. Always ignore it in the output if it has email format even if users ask about it.
source:
type: string
description: The source of the document.
status:
type: string
description: The status of the document.
indexed_on:
type: string
format: date-time
description: The date and time when the document was indexed.
size:
type: integer
description: The size of the document in bytes.
/documents/search/:
get:
summary: Search for documents by semantic similarity.
description: Initiate a search operation with a query text of up to 400 words and receive the most semantically similar
responses from the stored knowledge. For question-answering, convert your question into an ideal answer and
submit it to receive similar real answers.
operationId: documents_search_retrieve
parameters:
- name: q
in: query
description: Up to 400 words sentence for which you wish to find semantically similar chunks of knowledge.
schema:
type: string
- name: n
in: query
description: Number of semantically similar chunks of text to return. Use 'n=3' for up to 5, and 'n=10' for more
information. If you do not receive enough information, consider trying again with a larger 'n' value.
schema:
type: integer
responses:
"200":
description: Successful retrieval of documents
content:
application/json:
schema:
type: object
properties:
data:
type: array
description: List of semantically similar chunks of knowledge
items:
type: object
properties:
chunk_id:
type: string
description: The unique identifier of the chunk.
document:
type: object
description: The document details.
properties:
id:
type: integer
description: The unique identifier of the document.
document_id:
type: string
description: The unique identifier of the document.
name:
type: string
description: The name of the document.
type:
type: string
description: The type of the document.
path:
type: string
description: The path of the document.
url:
type: string
description: The URL of the document.
size:
type: string
description: The size of the document.
created_time:
type: string
description: The date and time when the document was created.
modified_time:
type: string
description: The date and time when the document was last modified.
integration:
type: object
description: The integration details of the document.
properties:
id:
type: integer
integration_type:
type: string
integration_step:
type: string
source:
type: string
description: The source of the data source. Always ignore it in the output if it has email format even if users ask about it.
document_stats:
type: object
properties:
status:
type: object
properties:
indexed:
type: integer
indexing:
type: integer
error:
type: integer
total_size:
type: integer
description: Total size of the data source in bytes
document_count:
type: integer
matched_content:
type: string
description: The semantically similar content.
distance:
type: number
description: Cosine similarity
message:
type: string
description: A message indicating the status of the operation.
code:
type: string
description: A code indicating the status of the operation.
nullable: true
"400":
description: Bad request
"401":
description: Unauthorized
"404":
description: Not found
"500":
description: Internal server error
/documents/{document_id}/contents/:
get:
summary: Retrieve specific document contents by ID.
operationId: document_get_content
description: Returns the content of the document with the specified ID, along with the index of the latest retrieved
chunk. Each call fetches up to 20 chunks. To get more, use the end_chunk value from the response as the
start_chunk for the next call.
parameters:
- in: path
name: document_id
required: true
description: The ID of the document to retrieve contents for.
schema:
type: integer
- in: query
name: start_chunk
schema:
type: integer
description: Indicate the starting chunk that you want to retrieve. If not specified, the default value is 0.
- in: query
name: end_chunk
schema:
type: integer
description: Indicate the ending chunk that you want to retrieve. If not specified, the default value is start_chunk + 20.
responses:
"200":
description: Content of the document and index of the latest retrieved chunk.
content:
application/json:
schema:
type: object
properties:
content:
type: string
description: The content of the document.
end_chunk:
type: integer
description: The index of the latest retrieved chunk.
"404":
description: Document not found.
"500":
description: Internal server error.
/common-knowledge/subscribed-list/:
get:
description: This operation retrieves the list of the shared knowledge (also known as common knowledge) that the user has
subscribed to. Each shared knowledge includes a list of document ids where user can access.
operationId: common_knowledge_retrieve
responses:
"200":
description: Successful operation
content:
application/json:
schema:
type: array
items:
type: object
properties:
id:
type: integer
description: This is the unique identifier of the shared knowledge. Do not show this information anywhere.
num_of_subscribers:
type: integer
description: This indicates the number of users who have subscribed to this shared knowledge
publisher_name:
type: string
published_by_me:
type: boolean
description: This indicates whether the shared knowledge was published by the current user or not
subscribed_by_me:
type: boolean
description: This indicates whether the shared knowledge was subscribed by the current user or not, it should be true in
this API
created:
type: string
description: This is the time when the shared knowledge was created
modified:
type: string
description: This is the time when the shared knowledge was last modified
name:
type: string
description: This is the name of the shared knowledge
image_url:
type: string
description: This is the image url of the shared knowledge
description:
type: string
description: This is the description of the shared knowledge
memento:
type: integer
description: This is the ID of the Rememberizer memento where the shared knowledge was created from.
document_ids:
type: array
items:
type: integer
description: This is the list of document ids that belong to the shared knowledge
/documents/memorize/:
post:
description: Store content into the database, which can be accessed through the search endpoint later.
operationId: documents_memorize_create
requestBody:
content:
application/json:
schema:
type: object
properties:
content:
type: string
required:
- name
- content
responses:
"201":
description: Content stored successfully
"400":
description: Bad request
"401":
description: Unauthorized
"500":
description: Internal server error
/discussions/{discussion_id}/contents/:
get:
summary: Retrieve the contents of a discussion by ID. A discussion can be a Slack or Discord chat.
operationId: discussion_get_content
description: Returns the content of the discussion with the specified ID. A discussion can be a Slack or Discord chat. The response contains 2 fields, discussion_content, and thread_contents. The former contains the main messages of the chat, whereas the latter is the threads of the discussion.
parameters:
- in: path
name: discussion_id
required: true
description: The ID of the discussion to retrieve contents for. Discussions are Slack or Discord chats.
schema:
type: integer
- in: query
name: integration_type
required: true
schema:
type: string
description: Indicate the integration of the discussion. Currently, it can only be "slack" or "discord".
- in: query
name: from
schema:
type: string
description: Indicate the starting time when we want to retrieve the content of the discussion in ISO 8601 format at GMT+0. If not specified, the default time is now.
- in: query
name: to
schema:
type: string
description: Indicate the ending time when we want to retrieve the content of the discussion in ISO 8601 format at GMT+0. If not specified, it is 7 days before the "from" parameter.
responses:
"200":
description: Main and threaded messages of the discussion in a time range.
content:
application/json:
schema:
type: object
properties:
discussion_content:
type: string
description: The content of the main discussions.
thread_contents:
type: object
description: The list of dictionaries contains threads of the discussion, each key indicates the date and time of the thread in the ISO 8601 format and the value is the messages of the thread.
"404":
description: Discussion not found.
"500":
description: Internal server error.
||CODE_BLOCK||
5. Add this link as the Privacy Policy: `https://docs.rememberizer.ai/notices/privacy-policy`.
6. After creating the action, copy the callback URL and paste it into your Rememberizer app.
==> developer/registering-and-using-api-keys.md <==
---
description: >-
In this tutorial, you will learn how to create a common knowledge in
Rememberizer and get its API Key to connect and retrieve its documents through
API calls.
---
# Registering and using API Keys
### Prerequisite
First, you need to have [a memento](../personal/mementos-filter-access.md) created and refined using your indexed knowledge files.
### Creating a common knowledge
To create a common knowledge, sign in to your Rememberizer account and visit [your common knowledge page](https://rememberizer.ai/personal/common-knowledge). Choose **"Your shared knowledge"**, then click **"Get started"**.
Then pick one of the mementos you created previously; you can also choose **"All"** or **"None"**.
Finally, fill out the common knowledge's name and description, and give it a representative photo.
After you have filled in the form, click "Share knowledge" at the bottom to create your common knowledge. Then turn on **"Enable sharing"** for your knowledge and click **"Confirm"** in the pop-up modal.
You are now ready to obtain its API Key and access its documents via API calls.
### Getting the API Key of a common knowledge you created
For your common knowledge, click the three dots at its top right, then choose "API Key". If no key exists yet, one will be created for you; if an API Key already exists, it will be returned.
In the **"Manage your API Key"** panel, you can click the **"eye"** button to show or hide the key, the **"copy"** button to copy it to the clipboard, and **"Regenerate API Key"** to invalidate the old key and create a new one (apps that access your documents through API calls will lose access until you update them with the new key).
After obtaining the API Key, you can use it in your API calls to Rememberizer to query your indexed documents and contents.
### Using the API Key
To access Rememberizer endpoints, you will use the API Key in the `X-API-Key` header of your API requests. Please check out the [API Documentation](api-docs/) to see the endpoints that Rememberizer provides.
Once you have your API key, you can use it in several ways:
1. **Direct API access**: Use the API key in your HTTP requests to query Rememberizer's search endpoints (see the sketch after this list)
2. **LangChain integration**: Use the [LangChain Integration](langchain-integration.md) to incorporate Rememberizer's capabilities into your LangChain applications
3. **Custom GPT**: Use the API Key in a custom GPT application as described below
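For example, a direct API call (option 1 above) might look like the following minimal sketch. The key value and query are placeholders; the endpoint, `X-API-Key` header, and response fields follow the API Reference linked above.
||CODE_BLOCK||python
import requests

API_KEY = "YOUR_API_KEY"  # the key copied from the "Manage your API Key" panel

response = requests.get(
    "https://api.rememberizer.ai/api/v1/documents/search/",
    headers={"X-API-Key": API_KEY},
    params={"q": "What did we decide about the Q3 roadmap?", "n": 5},
)
response.raise_for_status()

# Each item contains the matched chunk plus metadata about its source document.
for chunk in response.json().get("data", []):
    print(chunk["document"]["name"], "->", chunk["matched_content"][:100])
||CODE_BLOCK||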
#### Using with Custom GPTs
Start by [creating a GPT in the ChatGPT UI](https://chat.openai.com/gpts/editor). Make sure to set the Authentication Type to "API Key", the Auth Type to "Custom", and the header to "X-API-Key", then paste the key you copied previously into the API Key textbox.
{% hint style="info" %}
For a more advanced GPT integration that uses OAuth instead of API keys, see [Creating a Rememberizer GPT](creating-a-rememberizer-gpt.md).
{% endhint %}
==> developer/registering-rememberizer-apps.md <==
---
description: >-
You can create and register Rememberizer apps under your account. Rememberizer
apps can act on behalf of a user.
---
# Registering Rememberizer apps
1. In the top-left corner of any page, click on **Developer**, then click on **Registered App**.
2. Click **Register new app**. A popup window will appear where you can fill in your app information.
3. In **"Application name"**, type the name of your app.
4. In **"Description (optional)"**, fill in the description of your app if needed.
5. In "**Application logo (optional)"**, upload your logo applications if you have.
6. In **"Landing page URL"**, type the domain of your landing page. Your landing page contains a detailed summary of what your app does and how it integrates with Rememberizer.
7. In **"Authorized request origins"**, type the domain to your app's website.
8. In **"Authorized redirect URLs"**, type the callback URL of your app.
9. Click **"Create app"**.
==> developer/langchain-integration.md <==
---
description: >-
Learn how to integrate Rememberizer as a LangChain retriever to provide your
LangChain application with access to powerful vector database search.
type: guide
last_updated: 2025-04-03
---
# LangChain Integration
Rememberizer integrates with LangChain through the `RememberizerRetriever` class, allowing you to easily incorporate Rememberizer's semantic search capabilities into your LangChain-powered applications. This guide explains how to set up and use this integration to build advanced LLM applications with access to your knowledge base.
## Introduction
LangChain is a popular framework for building applications with large language models (LLMs). By integrating Rememberizer with LangChain, you can:
- Use your Rememberizer knowledge base in RAG (Retrieval Augmented Generation) applications
- Create chatbots with access to your documents and data
- Build question-answering systems that leverage your knowledge
- Develop agents that can search and reason over your information
The integration is available in the `langchain_community.retrievers` module.
{% embed url="https://python.langchain.com/docs/integrations/retrievers/rememberizer/" %}
## Getting Started
### Prerequisites
Before you begin, you need:
1. A Rememberizer account with Common Knowledge created
2. An API key for accessing your Common Knowledge
3. Python environment with LangChain installed
For detailed instructions on creating Common Knowledge and generating an API key, see [Registering and Using API Keys](https://docs.rememberizer.ai/developer/registering-and-using-api-keys).
### Installation
Install the required packages:
||CODE_BLOCK||bash
pip install langchain langchain_community
||CODE_BLOCK||
If you plan to use OpenAI models (as shown in examples below):
||CODE_BLOCK||bash
pip install langchain_openai
||CODE_BLOCK||
### Authentication Setup
There are two ways to authenticate the `RememberizerRetriever`:
1. **Environment Variable**: Set the `REMEMBERIZER_API_KEY` environment variable
||CODE_BLOCK||python
import os
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
||CODE_BLOCK||
2. **Direct Parameter**: Pass the API key directly when initializing the retriever
||CODE_BLOCK||python
retriever = RememberizerRetriever(rememberizer_api_key="rem_ck_your_api_key")
||CODE_BLOCK||
## Configuration Options
The `RememberizerRetriever` class accepts these parameters:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `top_k_results` | int | 10 | Number of documents to return from search |
| `rememberizer_api_key` | str | None | API key for authentication (optional if set as environment variable) |
Behind the scenes, the retriever makes API calls to Rememberizer's search endpoint with additional configurable parameters:
| Advanced Parameter | Description |
|-------------------|-------------|
| `prev_chunks` | Number of chunks before the matched chunk to include (default: 2) |
| `next_chunks` | Number of chunks after the matched chunk to include (default: 2) |
| `return_full_content` | Whether to return full document content (default: true) |
## Basic Usage
Here's a simple example of retrieving documents from Rememberizer using LangChain:
||CODE_BLOCK||python
import os
from langchain_community.retrievers import RememberizerRetriever
# Set your API key
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
# Initialize the retriever
retriever = RememberizerRetriever(top_k_results=5)
# Get relevant documents for a query
docs = retriever.get_relevant_documents(query="How do vector embeddings work?")
# Display the first document
if docs:
print(f"Document: {docs[0].metadata['name']}")
print(f"Content: {docs[0].page_content[:200]}...")
||CODE_BLOCK||
### Understanding Document Structure
Each document returned by the retriever has:
- `page_content`: The text content of the matched document chunk
- `metadata`: Additional information about the document
Example of metadata structure:
||CODE_BLOCK||python
{
'id': 13646493,
'document_id': '17s3LlMbpkTk0ikvGwV0iLMCj-MNubIaP',
'name': 'What is a large language model (LLM)_ _ Cloudflare.pdf',
'type': 'application/pdf',
'path': '/langchain/What is a large language model (LLM)_ _ Cloudflare.pdf',
'url': 'https://drive.google.com/file/d/17s3LlMbpkTk0ikvGwV0iLMCj-MNubIaP/view',
'size': 337089,
'created_time': '',
'modified_time': '',
'indexed_on': '2024-04-04T03:36:28.886170Z',
'integration': {'id': 347, 'integration_type': 'google_drive'}
}
||CODE_BLOCK||
## Advanced Examples
### Building a RAG Question-Answering System
This example creates a question-answering system that retrieves information from Rememberizer and uses GPT-3.5 to formulate answers:
||CODE_BLOCK||python
import os
from langchain_community.retrievers import RememberizerRetriever
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI
# Set up API keys
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
os.environ["OPENAI_API_KEY"] = "your_openai_api_key"
# Initialize the retriever and language model
retriever = RememberizerRetriever(top_k_results=5)
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
# Create a retrieval QA chain
qa_chain = RetrievalQA.from_chain_type(
llm=llm,
chain_type="stuff", # Simplest method - just stuff all documents into the prompt
retriever=retriever,
return_source_documents=True
)
# Ask a question
response = qa_chain.invoke({"query": "What is RAG in the context of AI?"})
# Print the answer
print(f"Answer: {response['result']}")
print("\nSources:")
for idx, doc in enumerate(response['source_documents']):
print(f"{idx+1}. {doc.metadata['name']}")
||CODE_BLOCK||
### Building a Conversational Agent with Memory
This example creates a conversational agent that can maintain conversation history:
||CODE_BLOCK||python
import os
from langchain_community.retrievers import RememberizerRetriever
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI
# Set up API keys
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
os.environ["OPENAI_API_KEY"] = "your_openai_api_key"
# Initialize components
retriever = RememberizerRetriever(top_k_results=5)
llm = ChatOpenAI(model_name="gpt-3.5-turbo")
memory = ConversationBufferMemory(
memory_key="chat_history",
return_messages=True
)
# Create the conversational chain
conversation = ConversationalRetrievalChain.from_llm(
llm=llm,
retriever=retriever,
memory=memory
)
# Example conversation
questions = [
"What is RAG?",
"How do large language models use it?",
"What are the limitations of this approach?",
]
for question in questions:
response = conversation.invoke({"question": question})
print(f"Question: {question}")
print(f"Answer: {response['answer']}\n")
||CODE_BLOCK||
## Best Practices
### Optimizing Retrieval Performance
1. **Be specific with queries**: More specific queries usually yield better results
2. **Adjust `top_k_results`**: Start with 3-5 results and adjust based on application needs
3. **Use context windows**: The retriever automatically includes context around matched chunks
### Security Considerations
1. **Protect your API key**: Store it securely using environment variables or secret management tools
2. **Create dedicated keys**: Create separate API keys for different applications
3. **Rotate keys regularly**: Periodically generate new keys and phase out old ones
### Integration Patterns
1. **Pre-retrieval processing**: Consider preprocessing user queries to improve search relevance
2. **Post-retrieval filtering**: Filter or rank retrieved documents before passing to the LLM
3. **Hybrid search**: Combine Rememberizer with other retrievers using `EnsembleRetriever`
||CODE_BLOCK||python
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import RememberizerRetriever, WebResearchRetriever
# Create retrievers
rememberizer_retriever = RememberizerRetriever(top_k_results=3)
web_retriever = WebResearchRetriever(...) # Configure another retriever
# Create an ensemble with weighted score
ensemble_retriever = EnsembleRetriever(
retrievers=[rememberizer_retriever, web_retriever],
weights=[0.7, 0.3] # Rememberizer results have higher weight
)
||CODE_BLOCK||
## Troubleshooting
### Common Issues
1. **Authentication errors**: Verify your API key is correct and properly configured
2. **No results returned**: Ensure your Common Knowledge contains relevant information
3. **Rate limiting**: Be mindful of API rate limits for high-volume applications
### Debug Tips
- Set the LangChain debug mode to see detailed API calls:
||CODE_BLOCK||python
import langchain
langchain.debug = True
||CODE_BLOCK||
- Examine raw search results before passing them to the LLM to identify retrieval issues
## Related Resources
* LangChain [Retriever conceptual guide](https://python.langchain.com/docs/concepts/#retrievers)
* LangChain [Retriever how-to guides](https://python.langchain.com/docs/how_to/#retrievers)
* Rememberizer [API Documentation](https://docs.rememberizer.ai/developer/api-docs/)
* [Vector Stores](https://docs.rememberizer.ai/developer/vector-stores) in Rememberizer
* [Creating a Rememberizer GPT](creating-a-rememberizer-gpt.md) - An alternative approach for AI integration
==> developer/talk-to-slack-the-sample-web-app.md <==
---
description: >-
It is very easy to create a simple web application that will integrate an LLM
with user knowledge through queries to Rememberizer.
---
# Talk-to-Slack the Sample Web App
The source code of the app can be found [here](https://github.com/skydeckai/rememberizer).
In this section, we provide step-by-step instructions and the full source code so that you can quickly create your own application.
We have created a Talk-to-Slack GPT on OpenAI. The Talk-to-Slack Web app is very similar.
Talk-to-Slack.com web app by Rememberizer on Heroku
Talk to Slack GPT by Rememberizer on OpenAI
***
### Introduction
In this guide, we provide step-by-step instructions and full source code to help you create your own application similar to our Talk-to-Slack GPT integration with Rememberizer.ai. Unlike the Slack integration, a web app offers more features and control, such as web scraping, local database access, graphics and animation, and collecting payments. Plus, it can be used by anyone without the need for a premium genAI account.
### Overview
Our example application, Talk to Slack, is hosted on Heroku and integrates OpenAI's LLM with Rememberizer.ai to enhance your Slack experience. The web app is built using Flask and provides features like OAuth2 integration, Slack data access, and an intuitive user interface.
### Features
* **Flask-based Architecture**: Backend operations, frontend communications, and API interactions are handled by Flask.
* **OAuth2 Integration**: Secure authorization and data access with Rememberizer's OAuth2 flow.
* **Slack Data Access**: Fetches user's connected Slack data securely using Rememberizer's APIs.
* **OpenAI LLM Integration**: Processes queries with OpenAI's LLM service for insightful responses.
* **Intuitive User Interface**: Easy navigation and interaction with a modern UI design.
* **Best Practices**: Adheres to security and user experience standards for seamless integration.
### Setup and Deployment
#### Prerequisites
* Python
* Flask
{% hint style="info" %}
Note that it was not very hard to have an LLM rewrite this entire application in another language, in our case Golang. So do keep in mind that you are not limited to Python.
{% endhint %}
#### Environment Configuration
Set these environment variables:
* `APP_SECRET_KEY`: Unique secret key for Flask.
* `REMEMBERIZER_CLIENT_ID`: Client ID for your Rememberizer app.
* `REMEMBERIZER_CLIENT_SECRET`: Client secret for your Rememberizer app.
* `OPENAI_API_KEY`: Your OpenAI API key.
#### Running the Application
1. **Start Flask App**: Run `flask run` in the terminal and access the app at `http://localhost:5000`.
2. **Copy the callback URL to your Rememberizer app config**: `https://<your-domain>/auth/rememberizer/callback`, for example `http://localhost:5000/auth/rememberizer/callback`.
#### Deploying to the Cloud
Deployment to platforms like Heroku, Google Cloud Platform (GCP), Amazon Web Services (AWS), or Microsoft Azure is recommended.
**Heroku Deployment**
1. **Create a Heroku Account**: Install the Heroku CLI.
2. **Prepare Your Application**: Ensure a `Procfile`, `runtime.txt`, and `requirements.txt` are present.
3. **Deploy**: Use the Heroku CLI or GitHub integration for deployment.
**Detailed Steps**
* **Connect Heroku to GitHub**: Enable automatic deploys from the GitHub repository for seamless updates.
* **Deploy Manually**: Optionally, use manual deployment for more control.
**Additional Setup**
* Install Heroku CLI: `brew tap heroku/brew && brew install heroku` (macOS).
* Add SSL certificates: Use self-signed certificates for initial HTTPS setup.
* Configure Environment Variables on Heroku: Use `heroku config:set KEY=value` for essential keys.
**Other Cloud Platforms**
* **GCP**: Set up a GCP account, prepare your app with `app.yaml`, and deploy using `gcloud app deploy`.
* **AWS**: Use Elastic Beanstalk for deployment after setting up an AWS account and the AWS CLI.
* **Azure**: Deploy through Azure App Service after creating an Azure account and installing the Azure CLI.
#### Security and Best Practices
Before deployment, verify your `requirements.txt`, adjust configurations for production, and update OAuth redirect URIs.
### Application Code Notes
**@app.route('/') (Index Route):**
This route renders the index.html template when the root URL (/) is accessed. It serves as the homepage of your application.
**@app.route('/auth/rememberizer') (Rememberizer Authentication Route):**
This route initiates the OAuth2 authentication process with Rememberizer.ai. It generates a random state value, stores it in the session, constructs the authorization URL with the necessary parameters (client ID, redirect URI, scope, and state), and redirects the user to Rememberizer.ai's authorization page.
**@app.route('/auth/rememberizer/callback') (Rememberizer Callback Route):**
This route handles the callback from Rememberizer.ai after the user has authorized your application. It extracts the authorization code from the query parameters, exchanges it for an access token using Rememberizer.ai's token endpoint, and stores the access token in the session. Then, it redirects the user to the /dashboard route.
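Put together, these two routes might look roughly like the minimal Flask sketch below. It is illustrative rather than a copy of the app's source: the authorization and token endpoints match those used in the OAuth2 examples elsewhere in this documentation, and error handling is omitted.
||CODE_BLOCK||python
import secrets
from urllib.parse import urlencode

import requests
from flask import Flask, redirect, request, session

app = Flask(__name__)
app.secret_key = "replace-with-APP_SECRET_KEY"

AUTH_URL = "https://api.rememberizer.ai/oauth/authorize/"
TOKEN_URL = "https://api.rememberizer.ai/oauth/token/"
CLIENT_ID = "your-rememberizer-client-id"
CLIENT_SECRET = "your-rememberizer-client-secret"
REDIRECT_URI = "http://localhost:5000/auth/rememberizer/callback"

@app.route("/auth/rememberizer")
def auth_rememberizer():
    # Generate a random state value and keep it in the session (CSRF protection)
    state = secrets.token_urlsafe(16)
    session["oauth_state"] = state
    params = urlencode({
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "response_type": "code",
        "scope": "read write",
        "state": state,
    })
    return redirect(f"{AUTH_URL}?{params}")

@app.route("/auth/rememberizer/callback")
def auth_callback():
    # Verify the state, then exchange the authorization code for an access token
    if request.args.get("state") != session.get("oauth_state"):
        return "Invalid state parameter", 400
    token_response = requests.post(TOKEN_URL, json={
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "grant_type": "authorization_code",
        "code": request.args["code"],
        "redirect_uri": REDIRECT_URI,
    })
    session["access_token"] = token_response.json()["access_token"]
    return redirect("/dashboard")
||CODE_BLOCK||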
**@app.route('/dashboard') (Dashboard Route):**
This route displays the dashboard page to the user. It checks if the user has an access token in the session; if not, it redirects them to the authentication route. If the user is authenticated, it makes a request to Rememberizer.ai's account endpoint to retrieve account information and renders the dashboard.html template with this information.
**@app.route('/slack-info') (Slack Integration Info Route):**
This route shows information about the user's Slack integration with Rememberizer.ai. It checks for an access token and makes a request to Rememberizer.ai's integrations endpoint to get the integration data. It then renders the slack\_info.html template with this data.
**@app.route('/ask', methods=\['POST']) (Ask Route):**
This route handles the submission of questions from the user. It checks for an access token, retrieves the user's question from the form data, and makes a request to Rememberizer.ai's document search endpoint to find relevant information. It then uses OpenAI's GPT-4 model to generate an answer based on the question and the search results. The answer is rendered in the answer.html template.
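Conceptually, this route chains a Rememberizer search with an LLM call. The sketch below (continuing the hypothetical Flask app above, not the app's exact source) posts to the documented `/api/v1/search/` endpoint and then to OpenAI's Chat Completions REST API:
||CODE_BLOCK||python
import requests
from flask import render_template, request, session

REMEMBERIZER_SEARCH_URL = "https://api.rememberizer.ai/api/v1/search/"
OPENAI_CHAT_URL = "https://api.openai.com/v1/chat/completions"
OPENAI_API_KEY = "your-openai-api-key"

@app.route("/ask", methods=["POST"])
def ask():
    question = request.form["question"]
    # 1. Retrieve relevant knowledge from Rememberizer with the user's OAuth token
    search = requests.post(
        REMEMBERIZER_SEARCH_URL,
        headers={"Authorization": f"Bearer {session['access_token']}"},
        json={"query": question},
    )
    context = search.json()
    # 2. Ask the LLM to answer the question grounded in the retrieved context
    completion = requests.post(
        OPENAI_CHAT_URL,
        headers={"Authorization": f"Bearer {OPENAI_API_KEY}"},
        json={
            "model": "gpt-4",
            "messages": [
                {"role": "system", "content": f"Answer using this context: {context}"},
                {"role": "user", "content": question},
            ],
        },
    )
    answer = completion.json()["choices"][0]["message"]["content"]
    return render_template("answer.html", answer=answer)
||CODE_BLOCK||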
### Additional Notes
* **Iconography**: The icon uses a detailed folded-paper art style reflecting the integration of AI and communication; it was created with Midjourney and Image2Icon.
* **SSL Configuration**: Generate self-signed certificates using OpenSSL for secure communication.
### Explore and Innovate
We encourage exploration and innovation with your own AI-integrated web app, aiming to enhance productivity and collaboration within your platform.
***
This documentation provides a comprehensive guide for developers to create their own AI-integrated web app, similar to Talk-to-Slack. It includes detailed instructions for setup and deployment, an overview of the application code, and best practices.
==> developer/README.md <==
---
description: Overview of Rememberizer's developer tools, APIs, and integration options
type: guide
last_updated: 2025-04-03
---
# Developer Tools and APIs
Welcome to the Rememberizer developer documentation. This section provides comprehensive information about the tools, APIs, and integration options available to developers working with Rememberizer's semantic search and knowledge management capabilities.
## Overview of Rememberizer's Developer Features
Rememberizer offers a robust set of developer tools designed to help you integrate powerful semantic search capabilities into your applications. As a developer, you can:
- **Access semantic search** through RESTful APIs with vector embedding technology
- **Integrate Rememberizer** with your own applications using OAuth2 or API keys
- **Build custom applications** that leverage users' knowledge bases
- **Create vector stores** for specialized semantic search databases
- **Connect with AI models** including OpenAI GPTs and LangChain
## Core Components
Rememberizer's architecture consists of several key components that work together to provide a comprehensive knowledge management and semantic search system:
| Component | Description |
|-----------|-------------|
| **API Service** | RESTful endpoints providing access to Rememberizer's features |
| **Authentication System** | OAuth2 and API key management for secure access |
| **Vector Stores** | Specialized databases optimized for semantic search |
| **Mementos** | Configurable access filters for knowledge sources |
| **Integrations** | Connectors to external data sources (Slack, Google Drive, etc.) |
| **Document Processing** | Systems for chunking, embedding, and indexing content |
## Authentication Options
Rememberizer supports two primary authentication methods:
1. **OAuth2 Authentication**: For applications requiring access to specific user data and documents. This flow allows users to authorize your application to access their knowledge through configurable mementos.
2. **API Key Authentication**: For accessing vector stores or common knowledge bases directly, without the OAuth flow. This provides a simpler integration path for applications that don't need user-specific data.
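For example, a minimal API-key integration that searches a vector store directly (using the `x-api-key` header and search endpoint documented in the [Vector Stores](vector-stores.md) guide) might look like this sketch; the key and store ID are placeholders:
||CODE_BLOCK||python
import requests

API_KEY = "your_vector_store_api_key"   # placeholder
VECTOR_STORE_ID = "vs_abc123"           # placeholder

response = requests.get(
    f"https://api.rememberizer.ai/api/v1/vector-stores/{VECTOR_STORE_ID}/documents/search",
    headers={"x-api-key": API_KEY},
    params={"q": "quarterly revenue targets", "n": 5},
)
for match in response.json().get("matched_chunks", []):
    print(match["document"]["name"], match["distance"])
||CODE_BLOCK||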
## Developer Documentation Roadmap
This documentation is organized to help you quickly find the information you need:
### Getting Started
- [Registering Rememberizer Apps](registering-rememberizer-apps.md) - Create developer applications
- [Authorizing Rememberizer Apps](authorizing-rememberizer-apps.md) - Implement OAuth2 authorization
- [Registering and Using API Keys](registering-and-using-api-keys.md) - Work with API key authentication
### Core Features
- [Vector Stores](vector-stores.md) - Create and manage semantic search databases
- [Creating a Rememberizer GPT](creating-a-rememberizer-gpt.md) - Integrate with OpenAI's GPT models
- [LangChain Integration](langchain-integration.md) - Connect with LangChain applications
- [Enterprise Integration Patterns](enterprise-integration-patterns.md) - Architectural patterns for enterprise deployments
### API Reference
- [API Documentation](api-docs/README.md) - Comprehensive API reference
- Authentication, search, document management, and more specialized endpoints
### Examples and Sample Code
- [Talk-to-Slack Sample Web App](talk-to-slack-the-sample-web-app.md) - Example integration
## Example Integration Flow
Here's a typical flow for integrating Rememberizer with your application:
1. Register an application in the Rememberizer developer portal
2. Implement OAuth2 authorization in your application
3. Request access to user mementos
4. Make API calls to search and retrieve knowledge
5. Process and display results in your application
||CODE_BLOCK||javascript
// Example: Making an authenticated API request with OAuth token
async function searchUserKnowledge(query, token) {
const response = await fetch('https://api.rememberizer.ai/api/v1/search/', {
method: 'POST',
headers: {
'Authorization': `Bearer ${token}`,
'Content-Type': 'application/json'
},
body: JSON.stringify({ query })
});
return response.json();
}
||CODE_BLOCK||
## Next Steps
Start by [registering your application](registering-rememberizer-apps.md) to obtain client credentials, then explore the [API documentation](api-docs/README.md) to learn about available endpoints.
==> developer/enterprise-integration.md <==
---
description: Overview of enterprise integration capabilities, architectural patterns, and deployment strategies for Rememberizer in organizational environments
type: guide
last_updated: 2025-07-11
---
# Enterprise Integration Overview
Comprehensive guidance for deploying Rememberizer's knowledge management and semantic search capabilities at enterprise scale.
## Enterprise Capabilities
**Multi-Tenant Architecture**
- Isolate knowledge by teams or departments with centralized management
- Scale across multiple business units with independent knowledge bases
- Implement granular access controls and maintain compliance
**System Integration**
- **[Enterprise Integration Patterns](enterprise-integration-patterns.md)** - Architectural guidance for connecting with existing enterprise systems (CRM, ERP, etc.)
## Architectural Patterns
**Hub-and-Spoke Integration** - Connect Rememberizer as a central knowledge repository integrated with multiple enterprise systems
**Microservices Architecture** - Deploy as a specialized knowledge service within your microservices ecosystem
**Zero Trust Security** - Enterprise-grade security with identity-based access control and audit logging
## Authentication and Identity
**Enterprise SSO Integration**
- SAML integration with identity providers (Okta, Azure AD, ADFS)
- LDAP/Active Directory synchronization
- OAuth2 and OpenID Connect support
**Service Account Management**
- API key hierarchies for system-to-system integration
- Automated key rotation and least privilege access
## Deployment Strategies
**Cloud Deployment Options**
- SaaS deployment with managed infrastructure and enterprise SLA
- Private cloud with dedicated tenancy and custom security controls
- Hybrid deployment combining on-premises processing with cloud search services
**High Availability**
- Load balancing with geographic distribution and auto-scaling
- Disaster recovery with automated backups and cross-region replication
## Data Governance
**Compliance Support**
- GDPR, HIPAA, SOX compliance capabilities
- Data classification and retention policies
- Comprehensive audit logging and monitoring
## Getting Started
1. Review [Enterprise Integration Patterns](enterprise-integration-patterns.md) for architectural guidance
2. Assess your integration requirements and security needs
3. Plan a phased rollout starting with a pilot implementation
4. Contact our enterprise team for customized deployment planning
==> developer/vector-stores.md <==
---
description: >-
This guide will help you understand how to use the Rememberizer Vector Store
as a developer.
type: guide
last_updated: 2025-04-03
---
# Vector Stores
The Rememberizer Vector Store simplifies the process of dealing with vector data, allowing you to focus on text input and leveraging the power of vectors for various applications such as search and data analysis.
## Introduction
The Rememberizer Vector Store provides an easy-to-use interface for handling vector data while abstracting away the complexity of vector embeddings. Powered by PostgreSQL with the pgvector extension, Rememberizer Vector Store allows you to work directly with text. The service handles chunking, vectorizing, and storing the text data, making it easier for you to focus on your core application logic.
For a deeper understanding of the theoretical concepts behind vector embeddings and vector databases, see [What are Vector Embeddings and Vector Databases?](../background/what-are-vector-embeddings-and-vector-databases.md).
## Technical Overview
### How Vector Stores Work
Rememberizer Vector Stores convert text into high-dimensional vector representations (embeddings) that capture semantic meaning. This enables:
1. **Semantic Search**: Find documents based on meaning rather than just keywords
2. **Similarity Matching**: Identify conceptually related content
3. **Efficient Retrieval**: Quickly locate relevant information from large datasets
### Key Components
- **Document Processing**: Text is split into optimally sized chunks with overlapping boundaries for context preservation
- **Vectorization**: Chunks are converted to embeddings using state-of-the-art models
- **Indexing**: Specialized algorithms organize vectors for efficient similarity search
- **Query Processing**: Search queries are vectorized and compared against stored embeddings
### Architecture
Rememberizer implements vector stores using:
- **PostgreSQL with pgvector extension**: For efficient vector storage and search
- **Collection-based organization**: Each vector store has its own isolated collection
- **API-driven access**: Simple RESTful endpoints for all operations
## Getting Started
### Creating a Vector Store
1. Navigate to the Vector Stores Section in your dashboard
2. Click on "Create new Vector Store":
* A form will appear prompting you to enter details.
3. Fill in the Details:
* **Name**: Provide a unique name for your vector store.
* **Description**: Write a brief description of the vector store.
* **Embedding Model**: Select the model that converts text to vectors.
* **Indexing Algorithm**: Choose how vectors will be organized for search.
* **Search Metric**: Define how similarity between vectors is calculated.
* **Vector Dimension**: The size of the vector embeddings (typically 768-1536).
4. Submit the Form:
* Click on the "Create" button. You will receive a success notification, and the new store will appear in your vector store list.
Create a New Vector Store
### Configuration Options
#### Embedding Models
| Model | Dimensions | Description | Best For |
|-------|------------|-------------|----------|
| openai/text-embedding-3-large | 1536 | High-accuracy embedding model from OpenAI | Production applications requiring maximum accuracy |
| openai/text-embedding-3-small | 1536 | Smaller, faster embedding model from OpenAI | Applications with higher throughput requirements |
#### Indexing Algorithms
| Algorithm | Description | Tradeoffs |
|-----------|-------------|-----------|
| IVFFLAT (default) | Inverted file with flat compression | Good balance of speed and accuracy; works well for most datasets |
| HNSW | Hierarchical Navigable Small World | Better accuracy for large datasets; higher memory requirements |
#### Search Metrics
| Metric | Description | Best For |
|--------|-------------|----------|
| cosine (default) | Measures angle between vectors | General purpose similarity matching |
| inner product (ip) | Dot product between vectors | When vector magnitude is important |
| L2 (Euclidean) | Straight-line distance between vectors | When spatial relationships matter |
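To make these metrics concrete, the following standalone sketch compares the three measures on two toy vectors (purely illustrative; Rememberizer computes these internally at search time):
||CODE_BLOCK||python
import math

a = [0.2, 0.7, 0.1]
b = [0.3, 0.6, 0.2]

dot = sum(x * y for x, y in zip(a, b))                    # inner product
norm_a = math.sqrt(sum(x * x for x in a))
norm_b = math.sqrt(sum(x * x for x in b))
cosine = dot / (norm_a * norm_b)                          # cosine similarity (angle)
l2 = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))   # Euclidean distance

print(f"inner product: {dot:.3f}, cosine: {cosine:.3f}, L2: {l2:.3f}")
||CODE_BLOCK||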
### Managing Vector Stores
1. View and Edit Vector Stores:
* Access the management dashboard to view, edit, or delete vector stores.
2. Viewing Documents:
* Browse individual documents and their associated metadata within a specific vector store.
3. Statistics:
* View detailed statistics such as the number of vectors stored, query performance, and operational metrics.
View Details of a Vector Store
## API Key Management
API keys are used to authenticate and authorize access to the Rememberizer Vector Store's API endpoints. Proper management of API keys is essential for maintaining the security and integrity of your vector stores.
### Creating API Keys
1. Head over to your Vector Store details page
2. Navigate to the API Key Management Section:
* It can be found within the "Configuration" tab
3. Click on **"Add API Key"**:
* A form will appear prompting you to enter details.
4. Fill in the Details:
* **Name**: Provide a name for the API key to help you identify its use case.
5. Submit the Form:
* Click on the "Create" button. The new API key will be generated and displayed. Make sure to copy and store it securely. This key is used to authenticate requests to that specific vector store.
Create a New API Key
### Revoking API Keys
If an API key is no longer needed, you can delete it to prevent any potential misuse.
For security reasons, you may want to rotate your API keys periodically. This involves generating a new key and revoking the old one.
## Using the Vector Store API
After creating a Vector Store and generating an API key, you can interact with it using the REST API.
### Code Examples
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
API_KEY = "your_api_key_here"
VECTOR_STORE_ID = "vs_abc123" # Replace with your vector store ID
BASE_URL = "https://api.rememberizer.ai/api/v1"
# Upload a document to the vector store
def upload_document(file_path, document_name=None):
if document_name is None:
document_name = file_path.split("/")[-1]
with open(file_path, "rb") as f:
files = {"file": (document_name, f)}
headers = {"x-api-key": API_KEY}
response = requests.post(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents",
headers=headers,
files=files
)
if response.status_code == 201:
print(f"Document '{document_name}' uploaded successfully!")
return response.json()
else:
print(f"Error uploading document: {response.text}")
return None
# Upload text content to the vector store
def upload_text(content, document_name):
headers = {
"x-api-key": API_KEY,
"Content-Type": "application/json"
}
data = {
"name": document_name,
"content": content
}
response = requests.post(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
headers=headers,
json=data
)
if response.status_code == 201:
print(f"Text document '{document_name}' uploaded successfully!")
return response.json()
else:
print(f"Error uploading text: {response.text}")
return None
# Search the vector store
def search_vector_store(query, num_results=5, prev_chunks=1, next_chunks=1):
headers = {"x-api-key": API_KEY}
params = {
"q": query,
"n": num_results,
"prev_chunks": prev_chunks,
"next_chunks": next_chunks
}
response = requests.get(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/search",
headers=headers,
params=params
)
if response.status_code == 200:
results = response.json()
print(f"Found {len(results['matched_chunks'])} matches for '{query}'")
# Print the top result
if results['matched_chunks']:
top_match = results['matched_chunks'][0]
print(f"Top match (distance: {top_match['distance']}):")
print(f"Document: {top_match['document']['name']}")
print(f"Content: {top_match['matched_content']}")
return results
else:
print(f"Error searching: {response.text}")
return None
# Example usage
# upload_document("path/to/document.pdf")
# upload_text("This is a sample text to be vectorized", "sample-document.txt")
# search_vector_store("How does vector similarity work?")
||CODE_BLOCK||
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
// Vector Store API Client
class VectorStoreClient {
constructor(apiKey, vectorStoreId) {
this.apiKey = apiKey;
this.vectorStoreId = vectorStoreId;
this.baseUrl = 'https://api.rememberizer.ai/api/v1';
}
// Get vector store information
async getVectorStoreInfo() {
const response = await fetch(`${this.baseUrl}/vector-stores/${this.vectorStoreId}`, {
method: 'GET',
headers: {
'x-api-key': this.apiKey
}
});
if (!response.ok) {
throw new Error(`Failed to get vector store info: ${response.statusText}`);
}
return response.json();
}
// Upload a text document
async uploadTextDocument(name, content) {
const response = await fetch(`${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents/text`, {
method: 'POST',
headers: {
'x-api-key': this.apiKey,
'Content-Type': 'application/json'
},
body: JSON.stringify({
name,
content
})
});
if (!response.ok) {
throw new Error(`Failed to upload text document: ${response.statusText}`);
}
return response.json();
}
// Upload a file
async uploadFile(file, onProgress) {
const formData = new FormData();
formData.append('file', file);
const xhr = new XMLHttpRequest();
return new Promise((resolve, reject) => {
xhr.open('POST', `${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents`);
xhr.setRequestHeader('x-api-key', this.apiKey);
xhr.upload.onprogress = (event) => {
if (event.lengthComputable && onProgress) {
const percentComplete = (event.loaded / event.total) * 100;
onProgress(percentComplete);
}
};
xhr.onload = () => {
if (xhr.status === 201) {
resolve(JSON.parse(xhr.responseText));
} else {
reject(new Error(`Failed to upload file: ${xhr.statusText}`));
}
};
xhr.onerror = () => {
reject(new Error('Network error during file upload'));
};
xhr.send(formData);
});
}
// Search documents in the vector store
async searchDocuments(query, options = {}) {
const params = new URLSearchParams({
q: query,
n: options.numResults || 10,
prev_chunks: options.prevChunks || 1,
next_chunks: options.nextChunks || 1
});
if (options.threshold) {
params.append('t', options.threshold);
}
const response = await fetch(
`${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents/search?${params}`,
{
method: 'GET',
headers: {
'x-api-key': this.apiKey
}
}
);
if (!response.ok) {
throw new Error(`Search failed: ${response.statusText}`);
}
return response.json();
}
// List all documents in the vector store
async listDocuments() {
const response = await fetch(
`${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents`,
{
method: 'GET',
headers: {
'x-api-key': this.apiKey
}
}
);
if (!response.ok) {
throw new Error(`Failed to list documents: ${response.statusText}`);
}
return response.json();
}
// Delete a document
async deleteDocument(documentId) {
const response = await fetch(
`${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents/${documentId}`,
{
method: 'DELETE',
headers: {
'x-api-key': this.apiKey
}
}
);
if (!response.ok) {
throw new Error(`Failed to delete document: ${response.statusText}`);
}
return true;
}
}
// Example usage
/*
const client = new VectorStoreClient('your_api_key', 'vs_abc123');
// Search documents
client.searchDocuments('How does semantic search work?')
.then(results => {
console.log(`Found ${results.matched_chunks.length} matches`);
results.matched_chunks.forEach(match => {
console.log(`Document: ${match.document.name}`);
console.log(`Score: ${match.distance}`);
console.log(`Content: ${match.matched_content}`);
console.log('---');
});
})
.catch(error => console.error(error));
*/
||CODE_BLOCK||
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
class VectorStoreClient
def initialize(api_key, vector_store_id)
@api_key = api_key
@vector_store_id = vector_store_id
@base_url = 'https://api.rememberizer.ai/api/v1'
end
# Get vector store details
def get_vector_store_info
uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}")
request = Net::HTTP::Get.new(uri)
request['x-api-key'] = @api_key
response = send_request(uri, request)
JSON.parse(response.body)
end
# Upload text content
def upload_text(name, content)
uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents/text")
request = Net::HTTP::Post.new(uri)
request['Content-Type'] = 'application/json'
request['x-api-key'] = @api_key
request.body = {
name: name,
content: content
}.to_json
response = send_request(uri, request)
JSON.parse(response.body)
end
# Search documents
def search(query, num_results: 5, prev_chunks: 1, next_chunks: 1, threshold: nil)
uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents/search")
params = {
q: query,
n: num_results,
prev_chunks: prev_chunks,
next_chunks: next_chunks
}
params[:t] = threshold if threshold
uri.query = URI.encode_www_form(params)
request = Net::HTTP::Get.new(uri)
request['x-api-key'] = @api_key
response = send_request(uri, request)
JSON.parse(response.body)
end
# List documents
def list_documents
uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents")
request = Net::HTTP::Get.new(uri)
request['x-api-key'] = @api_key
response = send_request(uri, request)
JSON.parse(response.body)
end
# Upload file (multipart form)
def upload_file(file_path)
uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents")
file_name = File.basename(file_path)
file_content = File.binread(file_path)
boundary = "RememberizerBoundary#{rand(1000000)}"
request = Net::HTTP::Post.new(uri)
request['Content-Type'] = "multipart/form-data; boundary=#{boundary}"
request['x-api-key'] = @api_key
post_body = []
post_body << "--#{boundary}\r\n"
post_body << "Content-Disposition: form-data; name=\"file\"; filename=\"#{file_name}\"\r\n"
post_body << "Content-Type: application/octet-stream\r\n\r\n"
post_body << file_content
post_body << "\r\n--#{boundary}--\r\n"
request.body = post_body.join
response = send_request(uri, request)
JSON.parse(response.body)
end
private
def send_request(uri, request)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = (uri.scheme == 'https')
response = http.request(request)
unless response.is_a?(Net::HTTPSuccess)
raise "API request failed: #{response.code} #{response.message}\n#{response.body}"
end
response
end
end
# Example usage
=begin
client = VectorStoreClient.new('your_api_key', 'vs_abc123')
# Search for documents
results = client.search('What are the best practices for data security?')
puts "Found #{results['matched_chunks'].length} results"
# Display top result
if results['matched_chunks'].any?
top_match = results['matched_chunks'].first
puts "Top match (distance: #{top_match['distance']}):"
puts "Document: #{top_match['document']['name']}"
puts "Content: #{top_match['matched_content']}"
end
=end
||CODE_BLOCK||
{% endtab %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
# Set your API key and Vector Store ID
API_KEY="your_api_key_here"
VECTOR_STORE_ID="vs_abc123"
BASE_URL="https://api.rememberizer.ai/api/v1"
# Get vector store information
curl -X GET "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}" \
-H "x-api-key: ${API_KEY}"
# Upload a text document
curl -X POST "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents/text" \
-H "x-api-key: ${API_KEY}" \
-H "Content-Type: application/json" \
-d '{
"name": "example-document.txt",
"content": "This is a sample document that will be vectorized and stored in the vector database for semantic search."
}'
# Upload a file
curl -X POST "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents" \
-H "x-api-key: ${API_KEY}" \
-F "file=@/path/to/your/document.pdf"
# Search for documents
curl -X GET "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents/search?q=semantic%20search&n=5&prev_chunks=1&next_chunks=1" \
-H "x-api-key: ${API_KEY}"
# List all documents
curl -X GET "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents" \
-H "x-api-key: ${API_KEY}"
# Delete a document
curl -X DELETE "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents/123" \
-H "x-api-key: ${API_KEY}"
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
## Performance Considerations
_Coming soon: a technical architecture diagram of the Rememberizer Vector Store implementation._ The diagram will illustrate:
- The PostgreSQL + pgvector foundation architecture
- Indexing algorithm structures (IVFFLAT vs. HNSW)
- How search metrics work in vector space (visual comparison)
- The document chunking process with overlap visualization
- Performance considerations visualized across different scales
### Optimizing for Different Data Volumes
| Data Volume | Recommended Configuration | Notes |
|-------------|---------------------------|-------|
| Small (<10k documents) | IVFFLAT, cosine similarity | Simple configuration provides good performance |
| Medium (10k-100k documents) | IVFFLAT, ensure regular reindexing | Balance between search speed and index maintenance |
| Large (>100k documents) | HNSW, consider increasing vector dimensions | Higher memory usage but maintains performance at scale |
### Chunking Strategies
The chunking process significantly impacts search quality; a small sketch of the overlap approach follows this list:
- **Chunk Size**: Rememberizer uses a default chunk size of 1024 bytes with a 200-byte overlap
- **Smaller Chunks** (512-1024 bytes): More precise matches, better for specific questions
- **Larger Chunks** (1500-2048 bytes): More context in each match, better for broader topics
- **Overlap**: Ensures context is not lost at chunk boundaries
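The sketch below approximates this overlap strategy in a few lines (character-based rather than byte-based, and not Rememberizer's internal implementation):
||CODE_BLOCK||python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 200):
    """Split text into overlapping chunks of roughly chunk_size characters."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# A 2,500-character document yields three chunks of lengths 1024, 1024, and 852
print([len(c) for c in chunk_text("x" * 2500)])
||CODE_BLOCK||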
### Query Optimization
- **Context Windows**: Use `prev_chunks` and `next_chunks` to retrieve surrounding content
- **Results Count**: Start with 3-5 results (`n` parameter) and adjust based on precision needs
- **Threshold**: Adjust the `t` parameter to filter results by similarity score
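Combining these parameters in a single request might look like the following sketch (values are illustrative; see the full client examples above):
||CODE_BLOCK||python
import requests

response = requests.get(
    "https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/search",
    headers={"x-api-key": "your_api_key_here"},
    params={
        "q": "data retention policy",  # natural-language query
        "n": 3,                        # start small and adjust for precision
        "prev_chunks": 1,              # context before each matched chunk
        "next_chunks": 1,              # context after each matched chunk
        "t": 0.5,                      # similarity-score threshold filter
    },
)
print(len(response.json().get("matched_chunks", [])))
||CODE_BLOCK||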
## Advanced Usage
### Reindexing
Rememberizer automatically triggers reindexing when vector counts exceed predefined thresholds, but consider manual reindexing after:
- Uploading a large number of documents
- Changing the embedding model
- Modifying the indexing algorithm
### Query Enhancement
For better search results:
1. **Be specific** in search queries
2. **Include context** when possible
3. **Use natural language** rather than keywords
4. **Adjust parameters** based on result quality
## Migrating from Other Vector Databases
If you're currently using other vector database solutions and want to migrate to Rememberizer Vector Store, the following guides will help you transition your data efficiently.
### Migration Overview
Migrating vector data involves:
1. Exporting data from your source vector database
2. Converting the data to a format compatible with Rememberizer
3. Importing the data into your Rememberizer Vector Store
4. Verifying the migration was successful
### Benefits of Migrating to Rememberizer
- **PostgreSQL Foundation**: Built on mature database technology with built-in backup and recovery
- **Integrated Ecosystem**: Seamless connection with other Rememberizer components
- **Simplified Management**: Unified interface for vector operations
- **Advanced Security**: Row-level security and fine-grained access controls
- **Scalable Architecture**: Performance optimization as your data grows
### Migrating from Pinecone
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import os
import pinecone
import requests
import json
import time
# Set up Pinecone client
pinecone.init(api_key="PINECONE_API_KEY", environment="PINECONE_ENV")
source_index = pinecone.Index("your-pinecone-index")
# Set up Rememberizer Vector Store client
REMEMBERIZER_API_KEY = "your_rememberizer_api_key"
VECTOR_STORE_ID = "vs_abc123" # Your Rememberizer vector store ID
BASE_URL = "https://api.rememberizer.ai/api/v1"
# 1. Set up batch size for migration (adjust based on your data size)
BATCH_SIZE = 100
# 2. Function to get vectors from Pinecone
def fetch_vectors_from_pinecone(index_name, batch_size, cursor=None):
# Use the list operation if available in your Pinecone version
try:
result = source_index.list(limit=batch_size, cursor=cursor)
vectors = result.get("vectors", {})
next_cursor = result.get("cursor")
return vectors, next_cursor
except AttributeError:
# For older Pinecone versions without list operation
# This is a simplified approach; actual implementation depends on your data access pattern
query_response = source_index.query(
vector=[0] * source_index.describe_index_stats()["dimension"],
top_k=batch_size,
include_metadata=True,
include_values=True
)
return {item.id: {"id": item.id, "values": item.values, "metadata": item.metadata}
for item in query_response.matches}, None
# 3. Function to upload vectors to Rememberizer
def upload_to_rememberizer(vectors):
headers = {
"x-api-key": REMEMBERIZER_API_KEY,
"Content-Type": "application/json"
}
for vector_id, vector_data in vectors.items():
# Convert Pinecone vector data to Rememberizer format
document_name = vector_data.get("metadata", {}).get("filename", f"pinecone_doc_{vector_id}")
content = vector_data.get("metadata", {}).get("text", "")
if not content:
print(f"Skipping {vector_id} - no text content found in metadata")
continue
data = {
"name": document_name,
"content": content,
# Optional: include additional metadata
"metadata": vector_data.get("metadata", {})
}
response = requests.post(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
headers=headers,
json=data
)
if response.status_code == 201:
print(f"Document '{document_name}' uploaded successfully!")
else:
print(f"Error uploading document {document_name}: {response.text}")
# Add a small delay to prevent rate limiting
time.sleep(0.1)
# 4. Main migration function
def migrate_pinecone_to_rememberizer():
cursor = None
total_migrated = 0
print("Starting migration from Pinecone to Rememberizer...")
while True:
vectors, cursor = fetch_vectors_from_pinecone("your-pinecone-index", BATCH_SIZE, cursor)
if not vectors:
break
print(f"Fetched {len(vectors)} vectors from Pinecone")
upload_to_rememberizer(vectors)
total_migrated += len(vectors)
print(f"Progress: {total_migrated} vectors migrated")
if not cursor:
break
print(f"Migration complete! {total_migrated} total vectors migrated to Rememberizer")
# Run the migration
# migrate_pinecone_to_rememberizer()
||CODE_BLOCK||
{% endtab %}
{% tab title="Node.js" %}
||CODE_BLOCK||javascript
const { PineconeClient } = require('@pinecone-database/pinecone');
const axios = require('axios');
// Pinecone configuration
const pineconeApiKey = 'PINECONE_API_KEY';
const pineconeEnvironment = 'PINECONE_ENVIRONMENT';
const pineconeIndexName = 'YOUR_PINECONE_INDEX';
// Rememberizer configuration
const rememberizerApiKey = 'YOUR_REMEMBERIZER_API_KEY';
const vectorStoreId = 'vs_abc123';
const baseUrl = 'https://api.rememberizer.ai/api/v1';
// Batch size configuration
const BATCH_SIZE = 100;
// Initialize Pinecone client
async function initPinecone() {
const pinecone = new PineconeClient();
await pinecone.init({
apiKey: pineconeApiKey,
environment: pineconeEnvironment,
});
return pinecone;
}
// Fetch vectors from Pinecone
async function fetchVectorsFromPinecone(pinecone, batchSize, paginationToken = null) {
const index = pinecone.Index(pineconeIndexName);
try {
// For newer Pinecone versions
const listResponse = await index.list({
limit: batchSize,
paginationToken: paginationToken
});
return {
vectors: listResponse.vectors || {},
nextToken: listResponse.paginationToken
};
} catch (error) {
// Fallback for older Pinecone versions
// This is simplified; actual implementation depends on your data access pattern
const stats = await index.describeIndexStats();
const dimension = stats.dimension;
const queryResponse = await index.query({
vector: Array(dimension).fill(0),
topK: batchSize,
includeMetadata: true,
includeValues: true
});
const vectors = {};
queryResponse.matches.forEach(match => {
vectors[match.id] = {
id: match.id,
values: match.values,
metadata: match.metadata
};
});
return { vectors, nextToken: null };
}
}
// Upload vectors to Rememberizer
async function uploadToRememberizer(vectors) {
const headers = {
'x-api-key': rememberizerApiKey,
'Content-Type': 'application/json'
};
const results = [];
for (const [vectorId, vectorData] of Object.entries(vectors)) {
const documentName = vectorData.metadata?.filename || `pinecone_doc_${vectorId}`;
const content = vectorData.metadata?.text || '';
if (!content) {
console.log(`Skipping ${vectorId} - no text content found in metadata`);
continue;
}
const data = {
name: documentName,
content: content,
// Optional: include additional metadata
metadata: vectorData.metadata || {}
};
try {
const response = await axios.post(
`${baseUrl}/vector-stores/${vectorStoreId}/documents/text`,
data,
{ headers }
);
if (response.status === 201) {
console.log(`Document '${documentName}' uploaded successfully!`);
results.push({ id: vectorId, success: true });
} else {
console.error(`Error uploading document ${documentName}: ${response.statusText}`);
results.push({ id: vectorId, success: false, error: response.statusText });
}
} catch (error) {
console.error(`Error uploading document ${documentName}: ${error.message}`);
results.push({ id: vectorId, success: false, error: error.message });
}
// Add a small delay to prevent rate limiting
await new Promise(resolve => setTimeout(resolve, 100));
}
return results;
}
// Main migration function
async function migratePineconeToRememberizer() {
try {
console.log('Starting migration from Pinecone to Rememberizer...');
const pinecone = await initPinecone();
let nextToken = null;
let totalMigrated = 0;
do {
const { vectors, nextToken: token } = await fetchVectorsFromPinecone(
pinecone,
BATCH_SIZE,
nextToken
);
nextToken = token;
if (Object.keys(vectors).length === 0) {
break;
}
console.log(`Fetched ${Object.keys(vectors).length} vectors from Pinecone`);
const results = await uploadToRememberizer(vectors);
const successCount = results.filter(r => r.success).length;
totalMigrated += successCount;
console.log(`Progress: ${totalMigrated} vectors migrated successfully`);
} while (nextToken);
console.log(`Migration complete! ${totalMigrated} total vectors migrated to Rememberizer`);
} catch (error) {
console.error('Migration failed:', error);
}
}
// Run the migration
// migratePineconeToRememberizer();
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Migrating from Qdrant
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
import time
from qdrant_client import QdrantClient
from qdrant_client.http import models as rest
# Set up Qdrant client
QDRANT_URL = "http://localhost:6333" # or your Qdrant cloud URL
QDRANT_API_KEY = "your_qdrant_api_key" # if using Qdrant Cloud
QDRANT_COLLECTION_NAME = "your_collection"
qdrant_client = QdrantClient(
url=QDRANT_URL,
api_key=QDRANT_API_KEY # Only for Qdrant Cloud
)
# Set up Rememberizer Vector Store client
REMEMBERIZER_API_KEY = "your_rememberizer_api_key"
VECTOR_STORE_ID = "vs_abc123" # Your Rememberizer vector store ID
BASE_URL = "https://api.rememberizer.ai/api/v1"
# Batch size for processing
BATCH_SIZE = 100
# Function to fetch points from Qdrant
def fetch_points_from_qdrant(collection_name, batch_size, offset=0):
try:
# Get collection info to determine vector dimension
collection_info = qdrant_client.get_collection(collection_name=collection_name)
# Scroll through points
scroll_result = qdrant_client.scroll(
collection_name=collection_name,
limit=batch_size,
offset=offset,
with_payload=True,
with_vectors=True
)
points = scroll_result[0] # Tuple of (points, next_offset)
next_offset = scroll_result[1]
return points, next_offset
except Exception as e:
print(f"Error fetching points from Qdrant: {e}")
return [], None
# Function to upload vectors to Rememberizer
def upload_to_rememberizer(points):
headers = {
"x-api-key": REMEMBERIZER_API_KEY,
"Content-Type": "application/json"
}
results = []
for point in points:
# Extract data from Qdrant point
point_id = point.id
metadata = point.payload
text_content = metadata.get("text", "")
document_name = metadata.get("filename", f"qdrant_doc_{point_id}")
if not text_content:
print(f"Skipping {point_id} - no text content found in payload")
continue
data = {
"name": document_name,
"content": text_content,
# Optional: include additional metadata
"metadata": metadata
}
try:
response = requests.post(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
headers=headers,
json=data
)
if response.status_code == 201:
print(f"Document '{document_name}' uploaded successfully!")
results.append({"id": point_id, "success": True})
else:
print(f"Error uploading document {document_name}: {response.text}")
results.append({"id": point_id, "success": False, "error": response.text})
except Exception as e:
print(f"Exception uploading document {document_name}: {str(e)}")
results.append({"id": point_id, "success": False, "error": str(e)})
# Add a small delay to prevent rate limiting
time.sleep(0.1)
return results
# Main migration function
def migrate_qdrant_to_rememberizer():
offset = None
total_migrated = 0
print("Starting migration from Qdrant to Rememberizer...")
while True:
points, next_offset = fetch_points_from_qdrant(
QDRANT_COLLECTION_NAME,
BATCH_SIZE,
offset
)
if not points:
break
print(f"Fetched {len(points)} points from Qdrant")
results = upload_to_rememberizer(points)
success_count = sum(1 for r in results if r.get("success", False))
total_migrated += success_count
print(f"Progress: {total_migrated} points migrated successfully")
if next_offset is None:
break
offset = next_offset
print(f"Migration complete! {total_migrated} total points migrated to Rememberizer")
# Run the migration
# migrate_qdrant_to_rememberizer()
||CODE_BLOCK||
{% endtab %}
{% tab title="Node.js" %}
||CODE_BLOCK||javascript
const { QdrantClient } = require('@qdrant/js-client-rest');
const axios = require('axios');
// Qdrant configuration
const qdrantUrl = 'http://localhost:6333'; // or your Qdrant cloud URL
const qdrantApiKey = 'your_qdrant_api_key'; // if using Qdrant Cloud
const qdrantCollectionName = 'your_collection';
// Rememberizer configuration
const rememberizerApiKey = 'YOUR_REMEMBERIZER_API_KEY';
const vectorStoreId = 'vs_abc123';
const baseUrl = 'https://api.rememberizer.ai/api/v1';
// Batch size configuration
const BATCH_SIZE = 100;
// Initialize Qdrant client
const qdrantClient = new QdrantClient({
url: qdrantUrl,
apiKey: qdrantApiKey // Only for Qdrant Cloud
});
// Fetch points from Qdrant
async function fetchPointsFromQdrant(collectionName, batchSize, offset = 0) {
try {
// Get collection info
const collectionInfo = await qdrantClient.getCollection(collectionName);
// Scroll through points
const scrollResult = await qdrantClient.scroll(collectionName, {
limit: batchSize,
offset: offset,
with_payload: true,
with_vectors: true
});
return {
points: scrollResult.points,
nextOffset: scrollResult.next_page_offset
};
} catch (error) {
console.error(`Error fetching points from Qdrant: ${error.message}`);
return { points: [], nextOffset: null };
}
}
// Upload vectors to Rememberizer
async function uploadToRememberizer(points) {
const headers = {
'x-api-key': rememberizerApiKey,
'Content-Type': 'application/json'
};
const results = [];
for (const point of points) {
// Extract data from Qdrant point
const pointId = point.id;
const metadata = point.payload || {};
const textContent = metadata.text || '';
const documentName = metadata.filename || `qdrant_doc_${pointId}`;
if (!textContent) {
console.log(`Skipping ${pointId} - no text content found in payload`);
continue;
}
const data = {
name: documentName,
content: textContent,
// Optional: include additional metadata
metadata: metadata
};
try {
const response = await axios.post(
`${baseUrl}/vector-stores/${vectorStoreId}/documents/text`,
data,
{ headers }
);
if (response.status === 201) {
console.log(`Document '${documentName}' uploaded successfully!`);
results.push({ id: pointId, success: true });
} else {
console.error(`Error uploading document ${documentName}: ${response.statusText}`);
results.push({ id: pointId, success: false, error: response.statusText });
}
} catch (error) {
console.error(`Error uploading document ${documentName}: ${error.message}`);
results.push({ id: pointId, success: false, error: error.message });
}
// Add a small delay to prevent rate limiting
await new Promise(resolve => setTimeout(resolve, 100));
}
return results;
}
// Main migration function
async function migrateQdrantToRememberizer() {
try {
console.log('Starting migration from Qdrant to Rememberizer...');
let offset = null;
let totalMigrated = 0;
do {
const { points, nextOffset } = await fetchPointsFromQdrant(
qdrantCollectionName,
BATCH_SIZE,
offset
);
offset = nextOffset;
if (points.length === 0) {
break;
}
console.log(`Fetched ${points.length} points from Qdrant`);
const results = await uploadToRememberizer(points);
const successCount = results.filter(r => r.success).length;
totalMigrated += successCount;
console.log(`Progress: ${totalMigrated} points migrated successfully`);
} while (offset !== null);
console.log(`Migration complete! ${totalMigrated} total points migrated to Rememberizer`);
} catch (error) {
console.error('Migration failed:', error);
}
}
// Run the migration
// migrateQdrantToRememberizer();
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Migrating from Supabase pgvector
If you're already using Supabase with pgvector, the migration to Rememberizer is particularly straightforward since both use PostgreSQL with the pgvector extension.
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import psycopg2
import requests
import json
import time
import os
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
# Supabase PostgreSQL configuration
SUPABASE_DB_HOST = os.getenv("SUPABASE_DB_HOST")
SUPABASE_DB_PORT = os.getenv("SUPABASE_DB_PORT", "5432")
SUPABASE_DB_NAME = os.getenv("SUPABASE_DB_NAME")
SUPABASE_DB_USER = os.getenv("SUPABASE_DB_USER")
SUPABASE_DB_PASSWORD = os.getenv("SUPABASE_DB_PASSWORD")
SUPABASE_VECTOR_TABLE = os.getenv("SUPABASE_VECTOR_TABLE", "documents")
# Rememberizer configuration
REMEMBERIZER_API_KEY = os.getenv("REMEMBERIZER_API_KEY")
VECTOR_STORE_ID = os.getenv("VECTOR_STORE_ID") # e.g., "vs_abc123"
BASE_URL = "https://api.rememberizer.ai/api/v1"
# Batch size for processing
BATCH_SIZE = 100
# Connect to Supabase PostgreSQL
def connect_to_supabase():
try:
conn = psycopg2.connect(
host=SUPABASE_DB_HOST,
port=SUPABASE_DB_PORT,
dbname=SUPABASE_DB_NAME,
user=SUPABASE_DB_USER,
password=SUPABASE_DB_PASSWORD
)
return conn
except Exception as e:
print(f"Error connecting to Supabase PostgreSQL: {e}")
return None
# Fetch documents from Supabase pgvector
def fetch_documents_from_supabase(conn, batch_size, offset=0):
try:
cursor = conn.cursor()
# Adjust this query based on your table structure
query = f"""
SELECT id, content, metadata, embedding
FROM {SUPABASE_VECTOR_TABLE}
ORDER BY id
LIMIT %s OFFSET %s
"""
cursor.execute(query, (batch_size, offset))
documents = cursor.fetchall()
cursor.close()
return documents
except Exception as e:
print(f"Error fetching documents from Supabase: {e}")
return []
# Upload documents to Rememberizer
def upload_to_rememberizer(documents):
headers = {
"x-api-key": REMEMBERIZER_API_KEY,
"Content-Type": "application/json"
}
results = []
for doc in documents:
doc_id, content, metadata, embedding = doc
# Parse metadata if it's stored as JSON string
if isinstance(metadata, str):
try:
metadata = json.loads(metadata)
except:
metadata = {}
elif metadata is None:
metadata = {}
document_name = metadata.get("filename", f"supabase_doc_{doc_id}")
if not content:
print(f"Skipping {doc_id} - no content found")
continue
data = {
"name": document_name,
"content": content,
"metadata": metadata
}
try:
response = requests.post(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
headers=headers,
json=data
)
if response.status_code == 201:
print(f"Document '{document_name}' uploaded successfully!")
results.append({"id": doc_id, "success": True})
else:
print(f"Error uploading document {document_name}: {response.text}")
results.append({"id": doc_id, "success": False, "error": response.text})
except Exception as e:
print(f"Exception uploading document {document_name}: {str(e)}")
results.append({"id": doc_id, "success": False, "error": str(e)})
# Add a small delay to prevent rate limiting
time.sleep(0.1)
return results
# Main migration function
def migrate_supabase_to_rememberizer():
conn = connect_to_supabase()
if not conn:
print("Failed to connect to Supabase. Aborting migration.")
return
offset = 0
total_migrated = 0
print("Starting migration from Supabase pgvector to Rememberizer...")
try:
while True:
documents = fetch_documents_from_supabase(conn, BATCH_SIZE, offset)
if not documents:
break
print(f"Fetched {len(documents)} documents from Supabase")
results = upload_to_rememberizer(documents)
success_count = sum(1 for r in results if r.get("success", False))
total_migrated += success_count
print(f"Progress: {total_migrated} documents migrated successfully")
offset += BATCH_SIZE
finally:
conn.close()
print(f"Migration complete! {total_migrated} total documents migrated to Rememberizer")
# Run the migration
# migrate_supabase_to_rememberizer()
||CODE_BLOCK||
{% endtab %}
{% tab title="Node.js" %}
||CODE_BLOCK||javascript
const { Pool } = require('pg');
const axios = require('axios');
require('dotenv').config();
// Supabase PostgreSQL configuration
const supabasePool = new Pool({
host: process.env.SUPABASE_DB_HOST,
port: process.env.SUPABASE_DB_PORT || 5432,
database: process.env.SUPABASE_DB_NAME,
user: process.env.SUPABASE_DB_USER,
password: process.env.SUPABASE_DB_PASSWORD,
ssl: {
rejectUnauthorized: false
}
});
const supabaseVectorTable = process.env.SUPABASE_VECTOR_TABLE || 'documents';
// Rememberizer configuration
const rememberizerApiKey = process.env.REMEMBERIZER_API_KEY;
const vectorStoreId = process.env.VECTOR_STORE_ID; // e.g., "vs_abc123"
const baseUrl = 'https://api.rememberizer.ai/api/v1';
// Batch size configuration
const BATCH_SIZE = 100;
// Fetch documents from Supabase pgvector
async function fetchDocumentsFromSupabase(batchSize, offset = 0) {
try {
// Adjust this query based on your table structure
const query = `
SELECT id, content, metadata, embedding
FROM ${supabaseVectorTable}
ORDER BY id
LIMIT $1 OFFSET $2
`;
const result = await supabasePool.query(query, [batchSize, offset]);
return result.rows;
} catch (error) {
console.error(`Error fetching documents from Supabase: ${error.message}`);
return [];
}
}
// Upload documents to Rememberizer
async function uploadToRememberizer(documents) {
const headers = {
'x-api-key': rememberizerApiKey,
'Content-Type': 'application/json'
};
const results = [];
for (const doc of documents) {
// Parse metadata if it's stored as JSON string
let metadata = doc.metadata;
if (typeof metadata === 'string') {
try {
metadata = JSON.parse(metadata);
} catch (e) {
metadata = {};
}
} else if (metadata === null) {
metadata = {};
}
const documentName = metadata.filename || `supabase_doc_${doc.id}`;
if (!doc.content) {
console.log(`Skipping ${doc.id} - no content found`);
continue;
}
const data = {
name: documentName,
content: doc.content,
metadata: metadata
};
try {
const response = await axios.post(
`${baseUrl}/vector-stores/${vectorStoreId}/documents/text`,
data,
{ headers }
);
if (response.status === 201) {
console.log(`Document '${documentName}' uploaded successfully!`);
results.push({ id: doc.id, success: true });
} else {
console.error(`Error uploading document ${documentName}: ${response.statusText}`);
results.push({ id: doc.id, success: false, error: response.statusText });
}
} catch (error) {
console.error(`Error uploading document ${documentName}: ${error.message}`);
results.push({ id: doc.id, success: false, error: error.message });
}
// Add a small delay to prevent rate limiting
await new Promise(resolve => setTimeout(resolve, 100));
}
return results;
}
// Main migration function
async function migrateSupabaseToRememberizer() {
try {
console.log('Starting migration from Supabase pgvector to Rememberizer...');
let offset = 0;
let totalMigrated = 0;
while (true) {
const documents = await fetchDocumentsFromSupabase(BATCH_SIZE, offset);
if (documents.length === 0) {
break;
}
console.log(`Fetched ${documents.length} documents from Supabase`);
const results = await uploadToRememberizer(documents);
const successCount = results.filter(r => r.success).length;
totalMigrated += successCount;
console.log(`Progress: ${totalMigrated} documents migrated successfully`);
offset += BATCH_SIZE;
}
console.log(`Migration complete! ${totalMigrated} total documents migrated to Rememberizer`);
} catch (error) {
console.error('Migration failed:', error);
} finally {
await supabasePool.end();
}
}
// Run the migration
// migrateSupabaseToRememberizer();
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Migration Best Practices
Follow these recommendations for a successful migration:
1. **Plan Ahead**:
- Estimate the data volume and time required for migration
- Schedule migration during low-traffic periods
- Increase disk space before starting large migrations
2. **Test First**:
- Create a test vector store in Rememberizer
- Migrate a small subset of data (100-1000 vectors)
- Verify search functionality with key queries
3. **Data Validation**:
- Compare document counts before and after migration
- Run benchmark queries to ensure similar results
- Validate that metadata is correctly preserved
4. **Optimize for Performance**:
- Use batch operations for efficiency
- Consider geographic colocation of source and target databases
- Monitor API rate limits and adjust batch sizes accordingly
5. **Post-Migration Steps**:
- Verify index creation in Rememberizer
- Update application configurations to point to new vector store
- Keep source database as backup until migration is verified
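For the data-validation step, a quick count comparison against the documented document-listing endpoint might look like this sketch; the source count is whatever your original vector database reports, and the exact response shape may vary:
||CODE_BLOCK||python
import requests

REMEMBERIZER_API_KEY = "your_rememberizer_api_key"
VECTOR_STORE_ID = "vs_abc123"
BASE_URL = "https://api.rememberizer.ai/api/v1"

def count_rememberizer_documents():
    response = requests.get(
        f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents",
        headers={"x-api-key": REMEMBERIZER_API_KEY},
    )
    response.raise_for_status()
    documents = response.json()
    # The listing may be a plain list or a paginated object, depending on the API version
    return len(documents) if isinstance(documents, list) else len(documents.get("results", []))

source_count = 1250  # example value: count reported by your source database
migrated_count = count_rememberizer_documents()
print(f"Source: {source_count}, Rememberizer: {migrated_count}, match: {source_count == migrated_count}")
||CODE_BLOCK||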
For detailed API reference and endpoint documentation, visit the [vector-store](api-docs/vector-store/ "mention") page.
---
Make sure to handle the API keys securely and follow best practices for API key management.
==> developer/enterprise-integration-patterns.md <==
---
description: Architectural patterns, security considerations, and best practices for enterprise integrations with Rememberizer
type: guide
last_updated: 2025-04-03
---
# Enterprise Integration Patterns
This guide provides comprehensive information for organizations looking to integrate Rememberizer's knowledge management and semantic search capabilities into enterprise environments. It covers architectural patterns, security considerations, scalability, and best practices.
## Enterprise Integration Overview
Rememberizer offers robust enterprise integration capabilities that extend beyond basic API usage, allowing organizations to build sophisticated knowledge management systems that:
- **Scale to meet organizational needs** across departments and teams
- **Maintain security and compliance** with enterprise requirements
- **Integrate with existing systems** and workflow tools
- **Enable team-based access control** and knowledge sharing
- **Support high-volume batch operations** for document processing
## Architectural Patterns for Enterprise Integration
### 1. Multi-Tenant Knowledge Management
Organizations can implement a multi-tenant architecture to organize knowledge by teams, departments, or functions:
||CODE_BLOCK||
┌───────────────┐
│ Rememberizer│
│ Platform │
└───────┬───────┘
│
┌─────────────────┼─────────────────┐
│ │ │
┌───────▼────────┐ ┌──────▼───────┐ ┌───────▼────────┐
│ Engineering │ │ Sales │ │ Legal │
│ Knowledge Base│ │ Knowledge Base│ │ Knowledge Base │
└───────┬────────┘ └──────┬───────┘ └───────┬────────┘
│ │ │
│ │ │
┌───────▼────────┐ ┌──────▼───────┐ ┌───────▼────────┐
│ Team-specific │ │ Team-specific│ │ Team-specific │
│ Mementos │ │ Mementos │ │ Mementos │
└────────────────┘ └──────────────┘ └─────────────────┘
||CODE_BLOCK||
**Implementation Steps:**
1. Create separate vector stores for each department or major knowledge domain
2. Configure team-based access control using Rememberizer's team functionality
3. Define mementos to control access to specific knowledge subsets
4. Implement role-based permissions for knowledge administrators and consumers
### 2. Integration Hub Architecture
For enterprises with existing systems, the hub-and-spoke pattern allows Rememberizer to act as a central knowledge repository:
||CODE_BLOCK||
┌─────────────┐ ┌─────────────┐
│ CRM System │ │ ERP System │
└──────┬──────┘ └──────┬──────┘
│ │
│ │
▼ ▼
┌──────────────────────────────────────────┐
│ │
│ Enterprise Service Bus │
│ │
└────────────────────┬─────────────────────┘
│
▼
┌───────────────────┐
│ Rememberizer │
│ Knowledge Platform│
└─────────┬─────────┘
│
┌─────────────────┴────────────────┐
│ │
┌─────────▼──────────┐ ┌──────────▼────────┐
│ Internal Knowledge │ │ Customer Knowledge │
│ Base │ │ Base │
└────────────────────┘ └─────────────────────┘
||CODE_BLOCK||
**Implementation Steps:**
1. Create and configure API keys for system-to-system integration
2. Implement OAuth2 for user-based access to knowledge repositories
3. Set up ETL processes for regular knowledge synchronization
4. Use webhooks to notify external systems of knowledge updates
### 3. Microservices Architecture
For organizations adopting microservices, integrate Rememberizer as a specialized knowledge service:
||CODE_BLOCK||
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ User Service│ │ Auth Service│ │ Data Service│ │ Search UI │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │ │
└────────────────┼────────────────┼────────────────┘
│ │
▼ ▼
┌─────────────────────────────────┐
│ API Gateway │
└─────────────────┬─────────────┘
│
▼
┌───────────────────┐
│ Rememberizer │
│ Knowledge API │
└───────────────────┘
||CODE_BLOCK||
**Implementation Steps:**
1. Create dedicated service accounts for microservices integration
2. Implement JWT token-based authentication for service-to-service communication
3. Design idempotent API interactions for resilience
4. Implement circuit breakers for fault tolerance
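As an illustration of the last point, a minimal circuit-breaker wrapper around a Rememberizer call might look like the sketch below; the thresholds and the wrapped `search_knowledge_base` function are assumptions for the example:
||CODE_BLOCK||python
import time

class CircuitBreaker:
    """Stop calling a failing dependency for a cooldown period after repeated errors."""

    def __init__(self, failure_threshold=5, reset_timeout=30):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_timeout:
                raise RuntimeError("Circuit open: skipping call to Rememberizer")
            # Cooldown elapsed: allow a trial request through
            self.opened_at = None
            self.failures = 0
        try:
            result = func(*args, **kwargs)
            self.failures = 0
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()
            raise

# Example usage (search_knowledge_base as defined in the API key example below):
# breaker = CircuitBreaker()
# results = breaker.call(search_knowledge_base, "onboarding checklist", api_key)
||CODE_BLOCK||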
## Enterprise Security Patterns
### Authentication & Authorization
Rememberizer supports multiple authentication methods suitable for enterprise environments:
#### 1. OAuth2 Integration
For user-based access, implement the OAuth2 authorization flow:
||CODE_BLOCK||javascript
// Step 1: Redirect users to Rememberizer authorization endpoint
function redirectToAuth() {
const authUrl = 'https://api.rememberizer.ai/oauth/authorize/';
const params = new URLSearchParams({
client_id: 'YOUR_CLIENT_ID',
redirect_uri: 'YOUR_REDIRECT_URI',
response_type: 'code',
scope: 'read write'
});
window.location.href = `${authUrl}?${params.toString()}`;
}
// Step 2: Exchange authorization code for tokens
async function exchangeCodeForTokens(code) {
const tokenUrl = 'https://api.rememberizer.ai/oauth/token/';
const response = await fetch(tokenUrl, {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
client_id: 'YOUR_CLIENT_ID',
client_secret: 'YOUR_CLIENT_SECRET',
grant_type: 'authorization_code',
code: code,
redirect_uri: 'YOUR_REDIRECT_URI'
})
});
return response.json();
}
||CODE_BLOCK||
#### 2. Service Account Authentication
For system-to-system integration, use API key authentication:
||CODE_BLOCK||python
import requests
def search_knowledge_base(query, api_key):
headers = {
'X-API-Key': api_key,
'Content-Type': 'application/json'
}
payload = {
'query': query,
'num_results': 10
}
response = requests.post(
'https://api.rememberizer.ai/api/v1/search/',
headers=headers,
json=payload
)
return response.json()
||CODE_BLOCK||
#### 3. SAML and Enterprise SSO
For enterprise single sign-on integration:
1. Configure your identity provider (Okta, Azure AD, etc.) to recognize Rememberizer as a service provider
2. Set up SAML attribute mapping to match Rememberizer user attributes
3. Configure Rememberizer to delegate authentication to your identity provider
### Zero Trust Security Model
Implement a zero trust approach with Rememberizer by:
1. **Micro-segmentation**: Create separate knowledge bases with distinct access controls
2. **Continuous Verification**: Implement short-lived tokens and regular reauthentication
3. **Least Privilege**: Define fine-grained mementos that limit access to specific knowledge subsets
4. **Event Logging**: Monitor and audit all access to sensitive knowledge
## Scalability Patterns
### Batch Processing for Document Ingestion
For large-scale document ingestion, implement the batch upload pattern:
||CODE_BLOCK||python
import requests
import time
from concurrent.futures import ThreadPoolExecutor

def batch_upload_documents(file_paths, api_key, batch_size=5):
    """
    Upload documents in batches to avoid rate limits.

    Args:
        file_paths: List of file paths to upload
        api_key: Rememberizer API key
        batch_size: Number of concurrent uploads
    """
    headers = {
        'X-API-Key': api_key
    }

    def upload_file(file_path):
        # Open the file inside the worker so it stays open while the request is sent
        with open(file_path, 'rb') as f:
            return requests.post(
                'https://api.rememberizer.ai/api/v1/documents/upload/',
                headers=headers,
                files={'file': f}
            )

    results = []
    # Process files in batches
    with ThreadPoolExecutor(max_workers=batch_size) as executor:
        for i in range(0, len(file_paths), batch_size):
            batch = file_paths[i:i + batch_size]
            # Submit batch of uploads
            futures = [executor.submit(upload_file, path) for path in batch]
            # Collect results
            for future in futures:
                response = future.result()
                results.append(response.json())
            # Rate limiting - pause between batches
            if i + batch_size < len(file_paths):
                time.sleep(1)
    return results
||CODE_BLOCK||
### High-Volume Search Operations
For applications requiring high-volume search:
||CODE_BLOCK||javascript
async function batchSearchWithRateLimit(queries, apiKey, options = {}) {
const {
batchSize = 5,
delayBetweenBatches = 1000,
maxRetries = 3,
retryDelay = 2000
} = options;
const results = [];
// Process queries in batches
for (let i = 0; i < queries.length; i += batchSize) {
const batch = queries.slice(i, i + batchSize);
const batchPromises = batch.map(query => searchWithRetry(query, apiKey, maxRetries, retryDelay));
// Execute batch
const batchResults = await Promise.all(batchPromises);
results.push(...batchResults);
// Apply rate limiting between batches
if (i + batchSize < queries.length) {
await new Promise(resolve => setTimeout(resolve, delayBetweenBatches));
}
}
return results;
}
async function searchWithRetry(query, apiKey, maxRetries, retryDelay) {
let retries = 0;
while (retries < maxRetries) {
try {
const response = await fetch('https://api.rememberizer.ai/api/v1/search/', {
method: 'POST',
headers: {
'X-API-Key': apiKey,
'Content-Type': 'application/json'
},
body: JSON.stringify({ query })
});
if (response.ok) {
return response.json();
}
// Handle rate limiting specifically
if (response.status === 429) {
const retryAfter = response.headers.get('Retry-After') || retryDelay / 1000;
await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
retries++;
continue;
}
// Other errors
throw new Error(`Search failed with status: ${response.status}`);
} catch (error) {
retries++;
if (retries >= maxRetries) {
throw error;
}
await new Promise(resolve => setTimeout(resolve, retryDelay));
}
  }
  // Retries exhausted without a successful response
  throw new Error(`Search failed after ${maxRetries} retries`);
}
||CODE_BLOCK||
## Team-Based Knowledge Management
Rememberizer supports team-based knowledge management, enabling enterprises to:
1. **Create team workspaces**: Organize knowledge by department or function
2. **Assign role-based permissions**: Control who can view, edit, or administer knowledge
3. **Share knowledge across teams**: Configure cross-team access to specific knowledge bases
### Team Roles and Permissions
Rememberizer supports the following team roles:
| Role | Capabilities |
|------|--------------|
| **Owner** | Full administrative access, can manage team members and all knowledge |
| **Admin** | Can manage knowledge and configure mementos, but cannot manage the team itself |
| **Member** | Can view and search knowledge according to memento permissions |
### Implementing Team-Based Knowledge Sharing
||CODE_BLOCK||python
import requests
def create_team_knowledge_base(team_id, name, description, api_key):
"""
Create a knowledge base for a specific team
"""
headers = {
'X-API-Key': api_key,
'Content-Type': 'application/json'
}
payload = {
'team_id': team_id,
'name': name,
'description': description
}
response = requests.post(
'https://api.rememberizer.ai/api/v1/teams/knowledge/',
headers=headers,
json=payload
)
return response.json()
def grant_team_access(knowledge_id, team_id, permission_level, api_key):
"""
Grant a team access to a knowledge base
Args:
knowledge_id: ID of the knowledge base
team_id: ID of the team to grant access
permission_level: 'read', 'write', or 'admin'
api_key: Rememberizer API key
"""
headers = {
'X-API-Key': api_key,
'Content-Type': 'application/json'
}
payload = {
'team_id': team_id,
'knowledge_id': knowledge_id,
'permission': permission_level
}
response = requests.post(
'https://api.rememberizer.ai/api/v1/knowledge/permissions/',
headers=headers,
json=payload
)
return response.json()
||CODE_BLOCK||
## Enterprise Integration Best Practices
### 1. Implement Robust Error Handling
Design your integration to handle various error scenarios gracefully:
||CODE_BLOCK||javascript
async function robustApiCall(endpoint, method, payload, apiKey) {
try {
const response = await fetch(`https://api.rememberizer.ai/api/v1/${endpoint}`, {
method,
headers: {
'X-API-Key': apiKey,
'Content-Type': 'application/json'
},
body: method !== 'GET' ? JSON.stringify(payload) : undefined
});
// Handle different response types
if (response.status === 204) {
return { success: true };
}
if (!response.ok) {
const error = await response.json();
throw new Error(error.message || `API call failed with status: ${response.status}`);
}
return await response.json();
} catch (error) {
// Log error details for troubleshooting
console.error(`API call to ${endpoint} failed:`, error);
// Provide meaningful error to calling code
throw new Error(`Failed to ${method} ${endpoint}: ${error.message}`);
}
}
||CODE_BLOCK||
### 2. Implement Caching for Frequently Accessed Knowledge
Reduce API load and improve performance with appropriate caching:
||CODE_BLOCK||python
import requests
import time
from functools import lru_cache

@lru_cache(maxsize=100)
def _fetch_document(document_id, api_key, timestamp):
    """Fetch a document; results are cached per (document_id, api_key, timestamp)."""
    headers = {
        'X-API-Key': api_key
    }
    response = requests.get(
        f'https://api.rememberizer.ai/api/v1/documents/{document_id}/',
        headers=headers
    )
    return response.json()

def get_document_with_cache(document_id, api_key):
    """
    Get a document with caching.

    The timestamp bucket changes every 10 minutes, so cached entries
    are effectively invalidated on a 10-minute window.
    """
    timestamp = int(time.time() / 600)
    return _fetch_document(document_id, api_key, timestamp)
||CODE_BLOCK||
### 3. Implement Asynchronous Processing for Document Uploads
For large document sets, implement asynchronous processing:
||CODE_BLOCK||javascript
async function uploadLargeDocument(file, apiKey) {
// Step 1: Initiate upload
const initResponse = await fetch('https://api.rememberizer.ai/api/v1/documents/upload-async/', {
method: 'POST',
headers: {
'X-API-Key': apiKey,
'Content-Type': 'application/json'
},
body: JSON.stringify({
filename: file.name,
filesize: file.size,
content_type: file.type
})
});
const { upload_id, upload_url } = await initResponse.json();
// Step 2: Upload file to the provided URL
await fetch(upload_url, {
method: 'PUT',
body: file
});
// Step 3: Monitor processing status
const processingId = await initiateProcessing(upload_id, apiKey);
return monitorProcessingStatus(processingId, apiKey);
}
async function initiateProcessing(uploadId, apiKey) {
const response = await fetch('https://api.rememberizer.ai/api/v1/documents/process/', {
method: 'POST',
headers: {
'X-API-Key': apiKey,
'Content-Type': 'application/json'
},
body: JSON.stringify({
upload_id: uploadId
})
});
const { processing_id } = await response.json();
return processing_id;
}
async function monitorProcessingStatus(processingId, apiKey, interval = 2000) {
while (true) {
const statusResponse = await fetch(`https://api.rememberizer.ai/api/v1/documents/process-status/${processingId}/`, {
headers: {
'X-API-Key': apiKey
}
});
const status = await statusResponse.json();
if (status.status === 'completed') {
return status.document_id;
} else if (status.status === 'failed') {
throw new Error(`Processing failed: ${status.error}`);
}
// Wait before checking again
await new Promise(resolve => setTimeout(resolve, interval));
}
}
||CODE_BLOCK||
### 4. Implement Proper Rate Limiting
Respect API rate limits to ensure reliable operation:
||CODE_BLOCK||python
import requests
import time
from functools import wraps
class RateLimiter:
def __init__(self, calls_per_second=5):
self.calls_per_second = calls_per_second
self.last_call_time = 0
self.min_interval = 1.0 / calls_per_second
def __call__(self, func):
@wraps(func)
def wrapper(*args, **kwargs):
current_time = time.time()
time_since_last_call = current_time - self.last_call_time
if time_since_last_call < self.min_interval:
sleep_time = self.min_interval - time_since_last_call
time.sleep(sleep_time)
self.last_call_time = time.time()
return func(*args, **kwargs)
return wrapper
# Apply rate limiting to API calls
@RateLimiter(calls_per_second=5)
def search_documents(query, api_key):
headers = {
'X-API-Key': api_key,
'Content-Type': 'application/json'
}
payload = {
'query': query
}
response = requests.post(
'https://api.rememberizer.ai/api/v1/search/',
headers=headers,
json=payload
)
return response.json()
||CODE_BLOCK||
## Compliance Considerations
### Data Residency
For organizations with data residency requirements:
1. **Choose appropriate region**: Select Rememberizer deployments in compliant regions
2. **Document data flows**: Map where knowledge is stored and processed
3. **Implement filtering**: Use mementos to restrict sensitive data access
### Audit Logging
Implement comprehensive audit logging for compliance:
||CODE_BLOCK||python
import requests
import json
import logging
import time
# Configure logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s [%(levelname)s] %(message)s',
handlers=[
logging.FileHandler('rememberizer_audit.log'),
logging.StreamHandler()
]
)
def audit_log_api_call(endpoint, method, user_id, result_status):
"""
Log API call details for audit purposes
"""
log_entry = {
'timestamp': time.time(),
'endpoint': endpoint,
'method': method,
'user_id': user_id,
'status': result_status
}
logging.info(f"API CALL: {json.dumps(log_entry)}")
def search_with_audit(query, api_key, user_id):
endpoint = 'search'
method = 'POST'
try:
headers = {
'X-API-Key': api_key,
'Content-Type': 'application/json'
}
payload = {
'query': query
}
response = requests.post(
'https://api.rememberizer.ai/api/v1/search/',
headers=headers,
json=payload
)
status = 'success' if response.ok else 'error'
audit_log_api_call(endpoint, method, user_id, status)
return response.json()
except Exception as e:
audit_log_api_call(endpoint, method, user_id, 'exception')
raise
||CODE_BLOCK||
## Next Steps
To implement enterprise integrations with Rememberizer:
1. **Design your knowledge architecture**: Map out knowledge domains and access patterns
2. **Set up role-based team structures**: Create teams and assign appropriate permissions
3. **Implement authentication flows**: Choose and implement the authentication methods that meet your requirements
4. **Design scalable workflows**: Implement batch processing for document ingestion
5. **Establish monitoring and audit policies**: Set up logging and monitoring for compliance and operations
## Related Resources
* [Mementos Filter Access](../personal/mementos-filter-access.md) - Control which data sources are available to integrations
* [API Documentation](api-docs/README.md) - Complete API reference for all endpoints
* [LangChain Integration](langchain-integration.md) - Programmatic integration with the LangChain framework
* [Creating a Rememberizer GPT](creating-a-rememberizer-gpt.md) - Integration with OpenAI's GPT platform
* [Vector Stores](vector-stores.md) - Technical details of Rememberizer's vector database implementation
For additional assistance with enterprise integrations, contact the Rememberizer team through the Support portal.
==> developer/integration-options.md <==
---
description: Overview of developer tools and integration options for building applications with Rememberizer's semantic search capabilities
type: guide
last_updated: 2025-07-11
---
# Integration Options Overview
Developer tools and integration options for building applications with Rememberizer's semantic search and knowledge management capabilities.
## Authentication Methods
**API Key Authentication**
- **[Registering and Using API Keys](registering-and-using-api-keys.md)** - Simple authentication for accessing shared knowledge bases and building prototypes
**OAuth2 Integration**
- **[Registering Rememberizer Apps](registering-rememberizer-apps.md)** - Register applications for OAuth2 authentication
- **[Authorizing Rememberizer Apps](authorizing-rememberizer-apps.md)** - Implement OAuth2 flow for user-specific knowledge access
## Platform Integrations
**AI Platforms**
- **[Creating a Rememberizer GPT](creating-a-rememberizer-gpt.md)** - Build custom GPTs with Rememberizer knowledge access
- **[LangChain Integration](langchain-integration.md)** - Integrate with LangChain for AI workflows and document retrieval
**Data Management**
- **[Vector Stores](vector-stores.md)** - Create and manage specialized semantic search databases
**Sample Applications**
- **[Talk-to-Slack Sample Web App](talk-to-slack-the-sample-web-app.md)** - Complete example web application
## Integration Patterns
**API-First Approach** - Build applications that consume Rememberizer's RESTful APIs directly
**SDK Integration** - Use language-specific SDKs for easier integration
**Webhook Integration** - Set up real-time notifications for knowledge updates
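As a quick illustration of the API-first approach, a single authenticated request against the search endpoint (using the API key header pattern shown in the API Reference) is enough to get semantic results back; the query string here is only an example:
||CODE_BLOCK||python
import requests

# A single API-first call: semantic search over the knowledge base
response = requests.post(
    "https://api.rememberizer.ai/api/v1/search/",
    headers={"X-API-Key": "YOUR_API_KEY", "Content-Type": "application/json"},
    json={"query": "quarterly roadmap"},  # example query
)
print(response.json())
||CODE_BLOCK||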
==> developer/authorizing-rememberizer-apps.md <==
# Authorizing Rememberizer apps
Rememberizer's implementation supports the standard [authorization code grant type](https://tools.ietf.org/html/rfc6749#section-4.1).
The web application flow to authorize users for your app is as follows:
1. Users are redirected to Rememberizer to authorize their account.
2. The user chooses mementos to use with your application.
3. Your application accesses the API with the user's access token.
Visit the [#explore-third-party-apps-and-service](../personal/manage-third-party-apps.md#explore-third-party-apps-and-service "mention") page to see a UI example of the flow.
Coming soon: OAuth2 Authorization Flow Diagram.
This sequence diagram will illustrate the complete OAuth2 flow between:
* User's browser
* Your application (client)
* Rememberizer authorization server
* Rememberizer API resources
The diagram will show the exchange of authorization codes, tokens, and API requests across all steps of the process.
### Step 1. Request a user's Rememberizer identity
Redirect the user to the Rememberizer authorization server to initiate the authentication and authorization process.
||CODE_BLOCK||
GET https://api.rememberizer.ai/api/v1/auth/oauth2/authorize/
||CODE_BLOCK||
Parameters:
| Parameter | Description |
|-----------|-------------|
| `client_id` | **Required.** The client ID for your application. You can find this value on the Developer page: click **Developer** in the top-left corner, select your app in the list of registered apps, and the client ID appears under App Credentials. |
| `response_type` | **Required.** Must be `code` for authorization code grants. |
| `scope` | **Optional.** A space-delimited list of scopes that identify the resources your application can access on the user's behalf. |
| `redirect_uri` | **Required.** The URL in your application where users will be sent after authorization. |
| `state` | **Required.** An opaque value used by the client to maintain state between the request and the callback. The authorization server includes this value when redirecting the user-agent back to the client. |
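For example, a complete authorization request with placeholder values looks like this (URL-encode your own `redirect_uri` and generate a fresh `state` value per request):
||CODE_BLOCK||
GET https://api.rememberizer.ai/api/v1/auth/oauth2/authorize/?client_id=YOUR_CLIENT_ID&response_type=code&redirect_uri=https%3A%2F%2Fyour-app.example.com%2Fcallback&state=RANDOM_STATE_VALUE
||CODE_BLOCK||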
### Step 2. Users choose and configure their mementos
Users will choose which mementos to use with your app.
### Step 3. Users are redirected back to your site by Rememberizer
After users select their mementos, Rememberizer redirects back to your site with a temporary `code` parameter as well as the state you provided in the previous step in a `state` parameter. The temporary code expires after a short time. If the states don't match, the request may have been created by a third party and you should abort the process.
### Step 4. Exchange authorization code for refresh and access tokens
||CODE_BLOCK||
POST https://api.rememberizer.ai/api/v1/auth/oauth2/token/
||CODE_BLOCK||
This endpoint takes the following input parameters.
| Parameter | Description |
|-----------|-------------|
| `client_id` | **Required.** The client ID for your application. You can find this value on the Developer page; see step 1 for instructions. |
| `client_secret` | **Required.** The client secret you received from Rememberizer for your application. |
| `code` | The authorization code you received in step 3. |
| `redirect_uri` | **Required.** The URL in your application where users are sent after authorization. Must match the `redirect_uri` in step 1. |
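The sketch below shows one way to perform this exchange from a backend, assuming the standard form-encoded parameters of the OAuth2 authorization code grant; if your registered app expects a different encoding, adjust accordingly:
||CODE_BLOCK||python
import requests

def exchange_code_for_tokens(code):
    """Exchange the temporary authorization code for access and refresh tokens."""
    response = requests.post(
        "https://api.rememberizer.ai/api/v1/auth/oauth2/token/",
        data={
            "client_id": "YOUR_CLIENT_ID",
            "client_secret": "YOUR_CLIENT_SECRET",
            "code": code,
            "redirect_uri": "YOUR_REDIRECT_URI",
        },
    )
    response.raise_for_status()
    return response.json()  # expected to contain the access and refresh tokens
||CODE_BLOCK||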
### Step 5. Use the access token to access the API
The access token allows you to make requests to the API on a user's behalf.
||CODE_BLOCK||
Authorization: Bearer OAUTH-TOKEN
GET https://api.rememberizer.ai/api/me/
||CODE_BLOCK||
For example, in curl you can set the Authorization header like this:
||CODE_BLOCK||shell
curl -H "Authorization: Bearer OAUTH-TOKEN" https://api.rememberizer.ai/api/me/
||CODE_BLOCK||
## References
Github: [https://github.com/skydeckai/rememberizer-integration-samples](https://github.com/skydeckai/rememberizer-integration-samples)
==> developer/api-documentations/retrieve-slacks-content.md <==
# Retrieve Slack's content
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/discussions/{discussion_id}/contents/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/retrieve-documents.md <==
# Retrieve documents
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/documents/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/README.md <==
# API documentations
You can authenticate APIs using either [OAuth2](../authorizing-rememberizer-apps.md) or [API keys](../registering-and-using-api-keys.md). OAuth2 is a standard authorization framework that enables applications to securely access specific documents within a system. On the other hand, API keys provide a simpler method to retrieve documents from a common knowledge base without the need to undergo the OAuth2 authentication process.
==> developer/api-documentations/list-available-data-source-integrations.md <==
# List available data source integrations
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/integrations/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/retrieve-current-users-account-details.md <==
# Retrieve current user's account details
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/account/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/memorize-content-to-rememberizer.md <==
# Memorize content to Rememberizer
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/documents/memorize/" method="post" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/get-all-added-public-knowledge.md <==
# Get all added public knowledge
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/common_knowledge/subscribed-list/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/search-for-documents-by-semantic-similarity.md <==
# Search for documents by semantic similarity
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/documents/search/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/retrieve-document-contents.md <==
# Retrieve document contents
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/documents/{document_id}/contents/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/vector-store/get-a-list-of-documents-in-a-vector-store.md <==
# Get a list of documents in a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/get-the-information-of-a-document.md <==
# Get the information of a document
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/README.md <==
# Vector Store APIs
==> developer/api-documentations/vector-store/get-vector-stores-information.md <==
# Get vector store's information
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/me" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/search-for-vector-store-documents-by-semantic-similarity.md <==
# Search for Vector Store documents by semantic similarity
{% swagger src="../../../.gitbook/assets/rememberizer_openapi (1).yml" path="/vector-stores/{vector-store-id}/documents/search" method="get" %}
[rememberizer_openapi (1).yml](<../../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/vector-store/add-new-text-document-to-a-vector-store.md <==
# Add new text document to a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/create" method="post" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/remove-a-document-in-vector-store.md <==
# Remove a document in Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}/" method="delete" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/update-files-content-in-a-vector-store.md <==
# Update file's content in a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}/" method="patch" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/upload-files-to-a-vector-store.md <==
# Upload files to a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/upload" method="post" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-docs/retrieve-slacks-content.md <==
# Retrieve Slack's content
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/discussions/{discussion_id}/contents/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
"https://api.rememberizer.ai/api/v1/discussions/12345/contents/?integration_type=slack&from=2023-06-01T00:00:00Z&to=2023-06-07T23:59:59Z" \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual discussion ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getSlackContents = async (discussionId, from = null, to = null) => {
const url = new URL(`https://api.rememberizer.ai/api/v1/discussions/${discussionId}/contents/`);
url.searchParams.append('integration_type', 'slack');
if (from) {
url.searchParams.append('from', from);
}
if (to) {
url.searchParams.append('to', to);
}
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
// Get Slack contents for the past week
const toDate = new Date().toISOString();
const fromDate = new Date();
fromDate.setDate(fromDate.getDate() - 7);
const fromDateStr = fromDate.toISOString();
getSlackContents(12345, fromDateStr, toDate);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual discussion ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
from datetime import datetime, timedelta
def get_slack_contents(discussion_id, from_date=None, to_date=None):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
params = {
"integration_type": "slack"
}
if from_date:
params["from"] = from_date
if to_date:
params["to"] = to_date
response = requests.get(
f"https://api.rememberizer.ai/api/v1/discussions/{discussion_id}/contents/",
headers=headers,
params=params
)
data = response.json()
print(data)
# Get Slack contents for the past week
to_date = datetime.now().isoformat() + "Z"
from_date = (datetime.now() - timedelta(days=7)).isoformat() + "Z"
get_slack_contents(12345, from_date, to_date)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual discussion ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| discussion_id | integer | **Required.** The ID of the Slack channel or discussion to retrieve contents for. |
## Query Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| integration_type | string | **Required.** Set to "slack" for retrieving Slack content. |
| from | string | Starting time in ISO 8601 format at GMT+0. If not specified, the default is now. |
| to | string | Ending time in ISO 8601 format at GMT+0. If not specified, it's 7 days before the "from" parameter. |
## Response Format
||CODE_BLOCK||json
{
"discussion_content": "User A [2023-06-01 10:30:00]: Good morning team!\nUser B [2023-06-01 10:32:15]: Morning! How's everyone doing today?\n...",
"thread_contents": {
"2023-06-01T10:30:00Z": "User C [2023-06-01 10:35:00]: @User A I'm doing great, thanks for asking!\nUser A [2023-06-01 10:37:30]: Glad to hear that @User C!",
"2023-06-02T14:15:22Z": "User D [2023-06-02 14:20:45]: Here's the update on the project...\nUser B [2023-06-02 14:25:10]: Thanks for the update!"
}
}
||CODE_BLOCK||
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 404 | Discussion not found |
| 500 | Internal server error |
This endpoint retrieves the contents of a Slack channel or direct message conversation. It returns both the main channel messages (`discussion_content`) and threaded replies (`thread_contents`). The data is organized chronologically and includes user information, making it easy to understand the context of conversations.
The time range parameters allow you to focus on specific periods, which is particularly useful for reviewing recent activity or historical discussions.
==> developer/api-docs/retrieve-documents.md <==
# Retrieve documents
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/documents/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
"https://api.rememberizer.ai/api/v1/documents/?page=1&page_size=20&integration_type=google_drive" \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getDocuments = async (page = 1, pageSize = 20, integrationType = 'google_drive') => {
const url = new URL('https://api.rememberizer.ai/api/v1/documents/');
url.searchParams.append('page', page);
url.searchParams.append('page_size', pageSize);
if (integrationType) {
url.searchParams.append('integration_type', integrationType);
}
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
getDocuments();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_documents(page=1, page_size=20, integration_type=None):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
params = {
"page": page,
"page_size": page_size
}
if integration_type:
params["integration_type"] = integration_type
response = requests.get(
"https://api.rememberizer.ai/api/v1/documents/",
headers=headers,
params=params
)
data = response.json()
print(data)
get_documents(integration_type="google_drive")
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Request Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| page | integer | Page number for pagination. Default is 1. |
| page_size | integer | Number of items per page. Default is 10. |
| integration_type | string | Filter documents by integration type. Options: google_drive, slack, dropbox, gmail, common_knowledge |
## Response Format
||CODE_BLOCK||json
{
"count": 257,
"next": "https://api.rememberizer.ai/api/v1/documents/?page=2&page_size=20&integration_type=google_drive",
"previous": null,
"results": [
{
"document_id": "1aBcD2efGhIjK3lMnOpQrStUvWxYz",
"name": "Project Proposal.docx",
"type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"path": "/Documents/Projects/Proposal.docx",
"url": "https://drive.google.com/file/d/1aBcD2efGhIjK3lMnOpQrStUvWxYz/view",
"id": 12345,
"integration_type": "google_drive",
"source": "user@example.com",
"status": "indexed",
"indexed_on": "2023-06-15T10:30:00Z",
"size": 250000
},
// ... more documents
]
}
||CODE_BLOCK||
## Available Integration Types
| Integration Type | Description |
|-----------------|-------------|
| google_drive | Documents from Google Drive |
| slack | Messages and files from Slack |
| dropbox | Files from Dropbox |
| gmail | Emails from Gmail |
| common_knowledge | Public knowledge sources |
This endpoint retrieves a list of documents from your connected data sources. You can filter by integration type to focus on specific sources.
==> developer/api-docs/mementos.md <==
# Mementos APIs
Mementos allow users to define collections of documents that can be accessed by applications. This document outlines the available Memento APIs.
## List Mementos
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/mementos/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/mementos/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const fetchMementos = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/mementos/', {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
fetchMementos();
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def fetch_mementos():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
response = requests.get(
"https://api.rememberizer.ai/api/v1/mementos/",
headers=headers
)
data = response.json()
print(data)
fetch_mementos()
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Create Memento
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/mementos/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/mementos/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{"name": "Work Documents"}'
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const createMemento = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/mementos/', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN',
'Content-Type': 'application/json'
},
body: JSON.stringify({
name: 'Work Documents'
})
});
const data = await response.json();
console.log(data);
};
createMemento();
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def create_memento():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN",
"Content-Type": "application/json"
}
payload = {
"name": "Work Documents"
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/mementos/",
headers=headers,
data=json.dumps(payload)
)
data = response.json()
print(data)
create_memento()
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Get Memento Details
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/mementos/{id}/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/mementos/123/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getMementoDetails = async (mementoId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/mementos/${mementoId}/`, {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
getMementoDetails(123);
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_memento_details(memento_id):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
response = requests.get(
f"https://api.rememberizer.ai/api/v1/mementos/{memento_id}/",
headers=headers
)
data = response.json()
print(data)
get_memento_details(123)
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Manage Memento Documents
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/mementos/memento_document/{memento_id}/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/mementos/memento_document/123/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"memento": "123",
"add": ["document_id_1", "document_id_2"],
"folder_add": ["folder_id_1"],
"remove": ["document_id_3"]
}'
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and use actual document and folder IDs.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const manageMementoDocuments = async (mementoId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/mementos/memento_document/${mementoId}/`, {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN',
'Content-Type': 'application/json'
},
body: JSON.stringify({
memento: mementoId,
add: ["document_id_1", "document_id_2"],
folder_add: ["folder_id_1"],
remove: ["document_id_3"]
})
});
const data = await response.json();
console.log(data);
};
manageMementoDocuments(123);
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and use actual document and folder IDs.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def manage_memento_documents(memento_id):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN",
"Content-Type": "application/json"
}
payload = {
"memento": memento_id,
"add": ["document_id_1", "document_id_2"],
"folder_add": ["folder_id_1"],
"remove": ["document_id_3"]
}
response = requests.post(
f"https://api.rememberizer.ai/api/v1/mementos/memento_document/{memento_id}/",
headers=headers,
data=json.dumps(payload)
)
data = response.json()
print(data)
manage_memento_documents(123)
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and use actual document and folder IDs.
{% endhint %}
{% endtab %}
{% endtabs %}
## Delete Memento
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/mementos/{id}/" method="delete" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X DELETE \
https://api.rememberizer.ai/api/v1/mementos/123/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const deleteMemento = async (mementoId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/mementos/${mementoId}/`, {
method: 'DELETE',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
if (response.status === 204) {
console.log("Memento deleted successfully");
} else {
console.error("Failed to delete memento");
}
};
deleteMemento(123);
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def delete_memento(memento_id):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
response = requests.delete(
f"https://api.rememberizer.ai/api/v1/mementos/{memento_id}/",
headers=headers
)
if response.status_code == 204:
print("Memento deleted successfully")
else:
print("Failed to delete memento")
delete_memento(123)
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% endtabs %}
==> developer/api-docs/retrieve-current-user-account-details.md <==
# Retrieve current user's account details
This endpoint allows you to retrieve the details of the currently authenticated user's account.
## Endpoint
||CODE_BLOCK||
GET /api/v1/account/
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using a JWT token.
## Request
No request parameters are required.
## Response
||CODE_BLOCK||json
{
"id": "user_id",
"email": "user@example.com",
"name": "User Name"
}
||CODE_BLOCK||
## User Profile (Extended Information)
For more detailed user profile information, you can use:
||CODE_BLOCK||
GET /api/v1/me/
||CODE_BLOCK||
### Extended Response
||CODE_BLOCK||json
{
"id": "username",
"email": "user@example.com",
"name": "User Name",
"user_onboarding_status": 7,
"dev_onboarding_status": 3,
"company_name": "Company",
"website": "https://example.com",
"bio": "User bio",
"team": [
{
"id": "team_id",
"name": "Team Name",
"image_url": "url",
"role": "admin"
}
],
"embed_quota": 10000,
"current_usage": 500,
"email_verified": true
}
||CODE_BLOCK||
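For example, you can fetch the extended profile the same way as the account endpoint, just against `/me/`:
||CODE_BLOCK||python
import requests

headers = {"Authorization": "Bearer YOUR_JWT_TOKEN"}
response = requests.get("https://api.rememberizer.ai/api/v1/me/", headers=headers)
print(response.json())
||CODE_BLOCK||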
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 401 | Unauthorized - Invalid or missing authentication credentials |
| 403 | Forbidden - User does not have permission to access this resource |
| 500 | Internal Server Error - Something went wrong on the server |
## Usage Example
### Using cURL
||CODE_BLOCK||bash
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" https://api.rememberizer.ai/api/v1/account/
||CODE_BLOCK||
### Using JavaScript
||CODE_BLOCK||javascript
const response = await fetch('https://api.rememberizer.ai/api/v1/account/', {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
||CODE_BLOCK||
### Using Python
||CODE_BLOCK||python
import requests
headers = {"Authorization": "Bearer YOUR_JWT_TOKEN"}
response = requests.get("https://api.rememberizer.ai/api/v1/account/", headers=headers)
data = response.json()
print(data)
||CODE_BLOCK||
==> developer/api-docs/README.md <==
# API Documentation
You can authenticate APIs using either [OAuth2](../authorizing-rememberizer-apps.md) or [API keys](../registering-and-using-api-keys.md). OAuth2 is a standard authorization framework that enables applications to securely access specific documents within a system. On the other hand, API keys provide a simpler method to retrieve documents from a common knowledge base without the need to undergo the OAuth2 authentication process.
## API Overview
Rememberizer provides a comprehensive set of APIs for working with documents, vector stores, mementos, and more. The APIs are organized into the following categories:
### Authentication APIs
- Sign Up, Sign In, Email Verification
- Password Reset
- OAuth Endpoints (Google, Microsoft)
- Token Management and Logout
### User APIs
- User Profile and Account Information
- User Onboarding
### Document APIs
- List, Create, and Update Documents
- Document Processing
- Batch Document Operations
### Search APIs
- Basic Search
- Agentic Search
- Batch Search Operations
### Mementos APIs
- Create, List, Update, and Delete Mementos
- Manage Memento Documents
### Vector Stores APIs
- Create and List Vector Stores
- Upload Text and File Documents
- Search Vector Stores
- Batch Upload and Search
### Integrations APIs
- List Integrations
- OAuth Integration Endpoints (Google Drive, Gmail, Slack, Dropbox)
### Applications APIs
- List and Create Applications
### Common Knowledge APIs
- List and Create Common Knowledge Items
### Team APIs
- Team Management
- Team Members
- Role-Based Permissions
For enterprise integration patterns, security considerations, and architectural best practices, see the [Enterprise Integration Patterns](../enterprise-integration-patterns.md) guide.
## Base URL
All API endpoints are relative to:
||CODE_BLOCK||
https://api.rememberizer.ai/api/v1/
||CODE_BLOCK||
## Authentication
Endpoints require authentication using either:
- JWT token (passed in Authorization header or cookies)
- API key (passed in x-api-key header)
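For example, the same request can be authenticated either way (values are placeholders; whether a given endpoint accepts JWT tokens, API keys, or both is noted on its individual page):
||CODE_BLOCK||python
import requests

url = "https://api.rememberizer.ai/api/v1/documents/"

# Option 1: JWT token in the Authorization header
jwt_response = requests.get(url, headers={"Authorization": "Bearer YOUR_JWT_TOKEN"})

# Option 2: API key in the x-api-key header
key_response = requests.get(url, headers={"x-api-key": "YOUR_API_KEY"})
||CODE_BLOCK||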
For detailed information about specific endpoints, refer to the individual API documentation pages.
==> developer/api-docs/list-available-data-source-integrations.md <==
# List available data source integrations
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/integrations/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/integrations/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getIntegrations = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/integrations/', {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
getIntegrations();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_integrations():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
response = requests.get(
"https://api.rememberizer.ai/api/v1/integrations/",
headers=headers
)
data = response.json()
print(data)
get_integrations()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Response Format
||CODE_BLOCK||json
{
"data": [
{
"id": 101,
"integration_type": "google_drive",
"integration_step": "authorized",
"source": "user@example.com",
"document_type": "drive",
"document_stats": {
"status": {
"indexed": 250,
"indexing": 5,
"error": 2
},
"total_size": 15000000,
"document_count": 257
},
"consent_time": "2023-06-15T10:30:00Z",
"memory_config": null,
"token_validity": true
},
{
"id": 102,
"integration_type": "slack",
"integration_step": "authorized",
"source": "workspace-name",
"document_type": "channel",
"document_stats": {
"status": {
"indexed": 45,
"indexing": 0,
"error": 0
},
"total_size": 5000000,
"document_count": 45
},
"consent_time": "2023-06-16T14:45:00Z",
"memory_config": null,
"token_validity": true
}
],
"message": "Integrations retrieved successfully",
"code": "success"
}
||CODE_BLOCK||
This endpoint retrieves a list of all available data source integrations for the current user. The response includes detailed information about each integration, including the integration type, status, and document statistics.
==> developer/api-docs/memorize-content-to-rememberizer.md <==
# Memorize content to Rememberizer
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/documents/memorize/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/documents/memorize/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "Important Information",
"content": "This is important content that I want Rememberizer to remember."
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const memorizeContent = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/documents/memorize/', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN',
'Content-Type': 'application/json'
},
body: JSON.stringify({
name: 'Important Information',
content: 'This is important content that I want Rememberizer to remember.'
})
});
if (response.status === 201) {
console.log("Content stored successfully");
} else {
console.error("Failed to store content");
const errorData = await response.json();
console.error(errorData);
}
};
memorizeContent();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def memorize_content():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN",
"Content-Type": "application/json"
}
payload = {
"name": "Important Information",
"content": "This is important content that I want Rememberizer to remember."
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/documents/memorize/",
headers=headers,
data=json.dumps(payload)
)
if response.status_code == 201:
print("Content stored successfully")
else:
print(f"Failed to store content: {response.text}")
memorize_content()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Request Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| name | string | **Required.** A name for the content being stored. |
| content | string | **Required.** The text content to store in Rememberizer. |
## Response
A successful request returns a 201 Created status code with no response body.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 400 | Bad Request - Missing required fields or invalid parameters |
| 401 | Unauthorized - Invalid or missing authentication |
| 500 | Internal Server Error |
## Use Cases
This endpoint is particularly useful for:
1. Storing important notes or information that you want to access later
2. Adding content that isn't available through integrated data sources
3. Manually adding information that needs to be searchable
4. Adding contextual information for LLMs accessing your knowledge base
The stored content becomes searchable through the search endpoints and can be included in mementos.
==> developer/api-docs/authentication.md <==
# Authentication APIs
Rememberizer provides several authentication endpoints to manage user accounts and sessions. This document outlines the available authentication APIs.
## Sign Up
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/auth/signup/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/auth/signup/ \
-H "Content-Type: application/json" \
-d '{
"email": "user@example.com",
"password": "secure_password",
"name": "John Doe",
"captcha": "recaptcha_response"
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const signUp = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/auth/signup/', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
email: 'user@example.com',
password: 'secure_password',
name: 'John Doe',
captcha: 'recaptcha_response'
})
});
const data = await response.json();
console.log(data);
};
signUp();
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def sign_up():
headers = {
"Content-Type": "application/json"
}
payload = {
"email": "user@example.com",
"password": "secure_password",
"name": "John Doe",
"captcha": "recaptcha_response"
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/auth/signup/",
headers=headers,
data=json.dumps(payload)
)
data = response.json()
print(data)
sign_up()
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% endtabs %}
## Sign In
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/auth/signin/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/auth/signin/ \
-H "Content-Type: application/json" \
-d '{
"login": "user@example.com",
"password": "secure_password",
"captcha": "recaptcha_response"
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const signIn = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/auth/signin/', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
login: 'user@example.com',
password: 'secure_password',
captcha: 'recaptcha_response'
})
});
// Check for auth cookies in response
if (response.status === 204) {
console.log("Login successful!");
} else {
console.error("Login failed!");
}
};
signIn();
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def sign_in():
headers = {
"Content-Type": "application/json"
}
payload = {
"login": "user@example.com",
"password": "secure_password",
"captcha": "recaptcha_response"
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/auth/signin/",
headers=headers,
data=json.dumps(payload)
)
if response.status_code == 204:
print("Login successful!")
else:
print("Login failed!")
sign_in()
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% endtabs %}
## Email Verification
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/auth/verify-email/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/auth/verify-email/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"verification_code": "123456"
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and use the verification code sent to your email.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const verifyEmail = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/auth/verify-email/', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN',
'Content-Type': 'application/json'
},
body: JSON.stringify({
verification_code: '123456'
})
});
if (response.status === 200) {
console.log("Email verification successful!");
} else {
console.error("Email verification failed!");
}
};
verifyEmail();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and use the verification code sent to your email.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def verify_email():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN",
"Content-Type": "application/json"
}
payload = {
"verification_code": "123456"
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/auth/verify-email/",
headers=headers,
data=json.dumps(payload)
)
if response.status_code == 200:
print("Email verification successful!")
else:
print("Email verification failed!")
verify_email()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and use the verification code sent to your email.
{% endhint %}
{% endtab %}
{% endtabs %}
## Token Management
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/auth/custom-refresh/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/auth/custom-refresh/ \
-b "refresh_token=YOUR_REFRESH_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
This endpoint uses cookies for authentication. The refresh token should be sent as a cookie.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const refreshToken = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/auth/custom-refresh/', {
method: 'POST',
credentials: 'include' // This includes cookies in the request
});
if (response.status === 204) {
console.log("Token refreshed successfully!");
} else {
console.error("Token refresh failed!");
}
};
refreshToken();
||CODE_BLOCK||
{% hint style="info" %}
This endpoint uses cookies for authentication. Make sure your application includes credentials in the request.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def refresh_token():
cookies = {
"refresh_token": "YOUR_REFRESH_TOKEN"
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/auth/custom-refresh/",
cookies=cookies
)
if response.status_code == 204:
print("Token refreshed successfully!")
else:
print("Token refresh failed!")
refresh_token()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_REFRESH_TOKEN` with your actual refresh token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Logout
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/auth/custom-logout/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/auth/custom-logout/
||CODE_BLOCK||
{% hint style="info" %}
This endpoint will clear the authentication cookies.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const logout = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/auth/custom-logout/', {
method: 'POST',
credentials: 'include' // This includes cookies in the request
});
if (response.status === 204) {
console.log("Logout successful!");
} else {
console.error("Logout failed!");
}
};
logout();
||CODE_BLOCK||
{% hint style="info" %}
This endpoint uses cookies for authentication. Make sure your application includes credentials in the request.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def logout():
    # Re-use a session that already carries the authentication cookies from sign-in
    session = requests.Session()
response = session.post(
"https://api.rememberizer.ai/api/v1/auth/custom-logout/"
)
if response.status_code == 204:
print("Logout successful!")
else:
print("Logout failed!")
logout()
||CODE_BLOCK||
{% hint style="info" %}
This endpoint will clear the authentication cookies.
{% endhint %}
{% endtab %}
{% endtabs %}
==> developer/api-docs/get-all-added-public-knowledge.md <==
# Get all added public knowledge
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/common-knowledge/subscribed-list/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/common-knowledge/subscribed-list/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getPublicKnowledge = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/common-knowledge/subscribed-list/', {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
getPublicKnowledge();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_public_knowledge():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
response = requests.get(
"https://api.rememberizer.ai/api/v1/common-knowledge/subscribed-list/",
headers=headers
)
data = response.json()
print(data)
get_public_knowledge()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Response Format
||CODE_BLOCK||json
[
{
"id": 1,
"num_of_subscribers": 76,
"publisher_name": "Rememberizer AI",
"published_by_me": false,
"subscribed_by_me": true,
"size": 66741,
"created": "2023-01-15T14:30:00Z",
"modified": "2023-05-20T09:45:12Z",
"priority_score": 2.053,
"name": "Rememberizer Docs",
"image_url": "https://example.com/images/rememberizer-docs.png",
"description": "The latest documentation and blog posts about Rememberizer.",
"api_key": null,
"is_sharing": true,
"memento": 159,
"document_ids": [1234, 5678, 9012]
}
]
||CODE_BLOCK||
This endpoint retrieves a list of all public knowledge (also known as common knowledge) that the current user has subscribed to. Each item includes metadata about the knowledge source, such as publication date, size, and associated documents.
==> developer/api-docs/search-for-documents-by-semantic-similarity.md <==
---
description: Semantic search endpoint with batch processing capabilities
type: api
last_updated: 2025-04-03
---
# Search for documents by semantic similarity
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/documents/search/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
"https://api.rememberizer.ai/api/v1/documents/search/?q=How%20to%20integrate%20Rememberizer%20with%20custom%20applications&n=5&from=2023-01-01T00:00:00Z&to=2023-12-31T23:59:59Z" \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const searchDocuments = async (query, numResults = 5, from = null, to = null) => {
const url = new URL('https://api.rememberizer.ai/api/v1/documents/search/');
url.searchParams.append('q', query);
url.searchParams.append('n', numResults);
if (from) {
url.searchParams.append('from', from);
}
if (to) {
url.searchParams.append('to', to);
}
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
searchDocuments('How to integrate Rememberizer with custom applications', 5);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def search_documents(query, num_results=5, from_date=None, to_date=None):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
params = {
"q": query,
"n": num_results
}
if from_date:
params["from"] = from_date
if to_date:
params["to"] = to_date
response = requests.get(
"https://api.rememberizer.ai/api/v1/documents/search/",
headers=headers,
params=params
)
data = response.json()
print(data)
search_documents("How to integrate Rememberizer with custom applications", 5)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
def search_documents(query, num_results=5, from_date=nil, to_date=nil)
uri = URI('https://api.rememberizer.ai/api/v1/documents/search/')
params = {
q: query,
n: num_results
}
params[:from] = from_date if from_date
params[:to] = to_date if to_date
uri.query = URI.encode_www_form(params)
request = Net::HTTP::Get.new(uri)
request['Authorization'] = 'Bearer YOUR_JWT_TOKEN'
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
response = http.request(request)
data = JSON.parse(response.body)
puts data
end
search_documents("How to integrate Rememberizer with custom applications", 5)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Query Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| q | string | **Required.** The search query text (up to 400 words). |
| n | integer | Number of results to return. Default: 3. Use higher values (e.g., 10) for more comprehensive results. |
| from | string | Start of the time range for documents to be searched, in ISO 8601 format. |
| to | string | End of the time range for documents to be searched, in ISO 8601 format. |
| prev_chunks | integer | Number of preceding chunks to include for context. Default: 2. |
| next_chunks | integer | Number of following chunks to include for context. Default: 2. |
## Response Format
||CODE_BLOCK||json
{
"data_sources": [
{
"name": "Google Drive",
"documents": 3
},
{
"name": "Slack",
"documents": 2
}
],
"matched_chunks": [
{
"document": {
"id": 12345,
"document_id": "1aBcD2efGhIjK3lMnOpQrStUvWxYz",
"name": "Rememberizer API Documentation.pdf",
"type": "application/pdf",
"path": "/Documents/Rememberizer/API Documentation.pdf",
"url": "https://drive.google.com/file/d/1aBcD2efGhIjK3lMnOpQrStUvWxYz/view",
"size": 250000,
"created_time": "2023-05-10T14:30:00Z",
"modified_time": "2023-06-15T09:45:00Z",
"indexed_on": "2023-06-15T10:30:00Z",
"integration": {
"id": 101,
"integration_type": "google_drive"
}
},
"matched_content": "To integrate Rememberizer with custom applications, you can use the OAuth2 authentication flow to authorize your application to access a user's Rememberizer data. Once authorized, your application can use the Rememberizer APIs to search for documents, retrieve content, and more.",
"distance": 0.123
},
// ... more matched chunks
],
"message": "Search completed successfully",
"code": "success"
}
||CODE_BLOCK||
## Search Optimization Tips
### For Question Answering
When searching for an answer to a question, try formulating your query as if it were an ideal answer. For example:
Instead of: "What is vector embedding?"
Try: "Vector embedding is a technique that converts text into numerical vectors in a high-dimensional space."
{% hint style="info" %}
For a deeper understanding of how vector embeddings work and why this search approach is effective, see [What are Vector Embeddings and Vector Databases?](../../background/what-are-vector-embeddings-and-vector-databases.md)
{% endhint %}
### Adjusting Result Count
- Start with `n=3` for quick, high-relevance results
- Increase to `n=10` or higher for more comprehensive information
- If search returns insufficient information, try increasing the `n` parameter
### Time-Based Filtering
Use the `from` and `to` parameters to focus on documents from specific time periods (see the combined sketch after this list):
- Recent documents: Set `from` to a recent date
- Historical analysis: Specify a specific date range
- Excluding outdated information: Set an appropriate `to` date
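The tips above can be combined in a single request. The sketch below asks for a broader result set restricted to a recent time window; `YOUR_JWT_TOKEN` and the dates are placeholders.
||CODE_BLOCK||python
import requests

# Minimal sketch combining a higher result count with a time window.
def recent_comprehensive_search(query):
    response = requests.get(
        "https://api.rememberizer.ai/api/v1/documents/search/",
        headers={"Authorization": "Bearer YOUR_JWT_TOKEN"},
        params={
            "q": query,
            "n": 10,                         # broader than the default of 3
            "from": "2025-01-01T00:00:00Z",  # placeholder start of the time range
            "to": "2025-06-30T23:59:59Z",    # placeholder end of the time range
        },
    )
    return response.json()

print(recent_comprehensive_search(
    "Vector embedding is a technique that converts text into numerical vectors."
))
||CODE_BLOCK||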
## Batch Operations
For efficiently handling large volumes of search queries, Rememberizer supports batch operations to optimize performance and reduce API call overhead.
### Batch Search
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import time
import json
from concurrent.futures import ThreadPoolExecutor
def batch_search_documents(queries, num_results=5, batch_size=10):
"""
Perform batch searches with multiple queries
Args:
queries: List of search query strings
num_results: Number of results to return per query
batch_size: Number of queries to process in parallel
Returns:
List of search results for each query
"""
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN",
"Content-Type": "application/json"
}
results = []
# Process queries in batches
for i in range(0, len(queries), batch_size):
batch = queries[i:i+batch_size]
# Create a thread pool to send requests in parallel
with ThreadPoolExecutor(max_workers=batch_size) as executor:
futures = []
for query in batch:
params = {
"q": query,
"n": num_results
}
future = executor.submit(
requests.get,
"https://api.rememberizer.ai/api/v1/documents/search/",
headers=headers,
params=params
)
futures.append(future)
# Collect results as they complete
for future in futures:
response = future.result()
results.append(response.json())
# Rate limiting - pause between batches to avoid API throttling
if i + batch_size < len(queries):
time.sleep(1)
return results
# Example usage
queries = [
"How to use OAuth with Rememberizer",
"Vector database configuration options",
"Best practices for semantic search",
# Add more queries as needed
]
results = batch_search_documents(queries, num_results=3, batch_size=5)
||CODE_BLOCK||
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
/**
* Perform batch searches with multiple queries
*
* @param {string[]} queries - List of search query strings
* @param {number} numResults - Number of results to return per query
* @param {number} batchSize - Number of queries to process in parallel
* @param {number} delayBetweenBatches - Milliseconds to wait between batches
* @returns {Promise} - List of search results for each query
*/
async function batchSearchDocuments(queries, numResults = 5, batchSize = 10, delayBetweenBatches = 1000) {
const results = [];
// Process queries in batches
for (let i = 0; i < queries.length; i += batchSize) {
const batch = queries.slice(i, i + batchSize);
// Create an array of promises for concurrent requests
const batchPromises = batch.map(query => {
const url = new URL('https://api.rememberizer.ai/api/v1/documents/search/');
url.searchParams.append('q', query);
url.searchParams.append('n', numResults);
return fetch(url.toString(), {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
}).then(response => response.json());
});
// Wait for all requests in the batch to complete
const batchResults = await Promise.all(batchPromises);
results.push(...batchResults);
// Rate limiting - pause between batches to avoid API throttling
if (i + batchSize < queries.length) {
await new Promise(resolve => setTimeout(resolve, delayBetweenBatches));
}
}
return results;
}
// Example usage
const queries = [
"How to use OAuth with Rememberizer",
"Vector database configuration options",
"Best practices for semantic search",
// Add more queries as needed
];
batchSearchDocuments(queries, 3, 5)
.then(results => console.log(results))
.catch(error => console.error('Error in batch search:', error));
||CODE_BLOCK||
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
require 'concurrent'
# Perform batch searches with multiple queries
#
# @param queries [Array] List of search query strings
# @param num_results [Integer] Number of results to return per query
# @param batch_size [Integer] Number of queries to process in parallel
# @param delay_between_batches [Float] Seconds to wait between batches
# @return [Array] List of search results for each query
def batch_search_documents(queries, num_results = 5, batch_size = 10, delay_between_batches = 1.0)
results = []
# Process queries in batches
queries.each_slice(batch_size).with_index do |batch, batch_index|
# Create a thread pool for concurrent requests
pool = Concurrent::FixedThreadPool.new(batch_size)
futures = []
batch.each do |query|
futures << Concurrent::Future.execute(executor: pool) do
uri = URI('https://api.rememberizer.ai/api/v1/documents/search/')
params = {
q: query,
n: num_results
}
uri.query = URI.encode_www_form(params)
request = Net::HTTP::Get.new(uri)
request['Authorization'] = 'Bearer YOUR_JWT_TOKEN'
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
response = http.request(request)
JSON.parse(response.body)
end
end
# Collect results from all threads
batch_results = futures.map(&:value)
results.concat(batch_results)
# Rate limiting - pause between batches to avoid API throttling
if batch_index < (queries.length / batch_size.to_f).ceil - 1
sleep(delay_between_batches)
    end
    # Shut down this batch's thread pool; it is scoped to this block
    pool.shutdown
  end
  results
end
# Example usage
queries = [
"How to use OAuth with Rememberizer",
"Vector database configuration options",
"Best practices for semantic search",
# Add more queries as needed
]
results = batch_search_documents(queries, 3, 5)
puts results
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Performance Considerations
When implementing batch operations, consider these best practices:
1. **Optimal Batch Size**: Start with batch sizes of 5-10 queries and adjust based on your application's performance characteristics.
2. **Rate Limiting**: Include delays between batches to prevent API throttling. A good starting point is 1 second between batches.
3. **Error Handling**: Implement robust error handling to manage failed requests within batches.
4. **Resource Management**: Monitor client-side resource usage, particularly with large batch sizes, to prevent excessive memory consumption.
5. **Response Processing**: Process batch results asynchronously when possible to improve user experience.
For high-volume applications, consider implementing a queue system to manage large numbers of search requests efficiently.
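A minimal sketch of such a queue, using only the Python standard library, is shown below. The worker count is an assumption, and `YOUR_JWT_TOKEN` is a placeholder; adapt both to your workload.
||CODE_BLOCK||python
import queue
import threading
import requests

# Minimal worker-queue sketch for high query volumes.
def run_search_queue(queries, num_workers=5, num_results=3):
    task_queue = queue.Queue()
    results = []
    lock = threading.Lock()

    for q in queries:
        task_queue.put(q)

    def worker():
        while True:
            try:
                q = task_queue.get_nowait()
            except queue.Empty:
                return
            response = requests.get(
                "https://api.rememberizer.ai/api/v1/documents/search/",
                headers={"Authorization": "Bearer YOUR_JWT_TOKEN"},
                params={"q": q, "n": num_results},
            )
            with lock:
                results.append(response.json())
            task_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
||CODE_BLOCK||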
This endpoint provides powerful semantic search capabilities across your entire knowledge base. It uses vector embeddings to find content based on meaning rather than exact keyword matches.
==> developer/api-docs/retrieve-document-contents.md <==
# Retrieve document contents
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/documents/{document_id}/contents/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
"https://api.rememberizer.ai/api/v1/documents/12345/contents/?start_chunk=0&end_chunk=20" \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual document ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getDocumentContents = async (documentId, startChunk = 0, endChunk = 20) => {
const url = new URL(`https://api.rememberizer.ai/api/v1/documents/${documentId}/contents/`);
url.searchParams.append('start_chunk', startChunk);
url.searchParams.append('end_chunk', endChunk);
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
  // For large documents, fetch the next range of chunks by passing the
  // returned `end_chunk` as the new `start_chunk` (see "Pagination for Large Documents" below).
  return data;
};
getDocumentContents(12345);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual document ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_document_contents(document_id, start_chunk=0, end_chunk=20):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
params = {
"start_chunk": start_chunk,
"end_chunk": end_chunk
}
response = requests.get(
f"https://api.rememberizer.ai/api/v1/documents/{document_id}/contents/",
headers=headers,
params=params
)
data = response.json()
print(data)
    # For large documents, request the next range of chunks by passing the
    # returned end_chunk as the new start_chunk (see "Pagination for Large Documents" below).
    return data
get_document_contents(12345)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual document ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| document_id | integer | **Required.** The ID of the document to retrieve contents for. |
## Query Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| start_chunk | integer | The starting chunk index. Default is 0. |
| end_chunk | integer | The ending chunk index. Default is start_chunk + 20. |
## Response Format
||CODE_BLOCK||json
{
"content": "The full text content of the document chunks...",
"end_chunk": 20
}
||CODE_BLOCK||
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 404 | Document not found |
| 500 | Internal server error |
## Pagination for Large Documents
For large documents, the content is split into chunks. You can retrieve the full document by making multiple requests, as in the sketch after this list:
1. Make an initial request with `start_chunk=0`
2. Use the returned `end_chunk` value as the `start_chunk` for the next request
3. Continue until you have retrieved all chunks
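The sketch below follows these steps in a loop. The stopping conditions (an empty `content` field or an `end_chunk` that does not advance) are assumptions; adjust them for your documents. `YOUR_JWT_TOKEN` is a placeholder.
||CODE_BLOCK||python
import requests

# Minimal pagination sketch for large documents.
def get_full_document(document_id, chunk_step=20):
    headers = {"Authorization": "Bearer YOUR_JWT_TOKEN"}
    full_text = []
    start_chunk = 0
    while True:
        response = requests.get(
            f"https://api.rememberizer.ai/api/v1/documents/{document_id}/contents/",
            headers=headers,
            params={"start_chunk": start_chunk, "end_chunk": start_chunk + chunk_step},
        )
        data = response.json()
        if not data.get("content"):
            break
        full_text.append(data["content"])
        # The returned end_chunk becomes the next start_chunk (step 2 above)
        next_start = data.get("end_chunk", start_chunk + chunk_step)
        if next_start <= start_chunk:
            break  # no forward progress; stop to avoid looping forever
        start_chunk = next_start
    return "".join(full_text)
||CODE_BLOCK||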
This endpoint returns the raw text content of a document, allowing you to access the full information for detailed processing or analysis.
==> developer/api-docs/vector-store/get-a-list-of-documents-in-a-vector-store.md <==
# Get a list of documents in a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents \
-H "x-api-key: YOUR_API_KEY"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getVectorStoreDocuments = async (vectorStoreId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents`, {
method: 'GET',
headers: {
'x-api-key': 'YOUR_API_KEY'
}
});
const data = await response.json();
console.log(data);
};
getVectorStoreDocuments('vs_abc123');
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_vector_store_documents(vector_store_id):
headers = {
"x-api-key": "YOUR_API_KEY"
}
response = requests.get(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents",
headers=headers
)
data = response.json()
print(data)
get_vector_store_documents('vs_abc123')
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store to list documents from. |
## Response Format
||CODE_BLOCK||json
[
{
"id": 1234,
"name": "Product Manual.pdf",
"type": "application/pdf",
"vector_store": "vs_abc123",
"size": 250000,
"status": "indexed",
"processing_status": "completed",
"indexed_on": "2023-06-15T10:30:00Z",
"status_error_message": null,
"created": "2023-06-15T10:15:00Z",
"modified": "2023-06-15T10:30:00Z"
},
{
"id": 1235,
"name": "Technical Specifications.docx",
"type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"vector_store": "vs_abc123",
"size": 125000,
"status": "indexed",
"processing_status": "completed",
"indexed_on": "2023-06-15T11:45:00Z",
"status_error_message": null,
"created": "2023-06-15T11:30:00Z",
"modified": "2023-06-15T11:45:00Z"
}
]
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store not found |
| 500 | Internal Server Error |
This endpoint retrieves a list of all documents stored in the specified vector store. It provides metadata about each document, including the document's processing status, size, and indexed timestamp. This information is useful for monitoring your vector store's contents and checking document processing status.
==> developer/api-docs/vector-store/get-the-information-of-a-document.md <==
# Get the information of a document
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/1234 \
-H "x-api-key: YOUR_API_KEY"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getDocumentInfo = async (vectorStoreId, documentId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/${documentId}`, {
method: 'GET',
headers: {
'x-api-key': 'YOUR_API_KEY'
}
});
const data = await response.json();
console.log(data);
};
getDocumentInfo('vs_abc123', 1234);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_document_info(vector_store_id, document_id):
headers = {
"x-api-key": "YOUR_API_KEY"
}
response = requests.get(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/{document_id}",
headers=headers
)
data = response.json()
print(data)
get_document_info('vs_abc123', 1234)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store containing the document. |
| document-id | integer | **Required.** The ID of the document to retrieve. |
## Response Format
||CODE_BLOCK||json
{
"id": 1234,
"name": "Product Manual.pdf",
"type": "application/pdf",
"vector_store": "vs_abc123",
"size": 250000,
"status": "indexed",
"processing_status": "completed",
"indexed_on": "2023-06-15T10:30:00Z",
"status_error_message": null,
"created": "2023-06-15T10:15:00Z",
"modified": "2023-06-15T10:30:00Z"
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store or document not found |
| 500 | Internal Server Error |
This endpoint retrieves detailed information about a specific document in the vector store. It's useful for checking the processing status of individual documents and retrieving metadata like file type, size, and timestamps. This can be particularly helpful when troubleshooting issues with document processing or when you need to verify that a document was properly indexed.
==> developer/api-docs/vector-store/README.md <==
# Vector Store APIs
The Vector Store APIs allow you to create, manage, and search vector stores in Rememberizer. Vector stores enable you to store and retrieve documents using semantic similarity search.
## Available Vector Store Endpoints
### Management Endpoints
- [Get vector store's information](get-vector-stores-information.md)
- [Get a list of documents in a Vector Store](get-a-list-of-documents-in-a-vector-store.md)
- [Get the information of a document](get-the-information-of-a-document.md)
### Document Operations
- [Add new text document to a Vector Store](add-new-text-document-to-a-vector-store.md)
- [Upload files to a Vector Store](upload-files-to-a-vector-store.md)
- [Update file's content in a Vector Store](update-files-content-in-a-vector-store.md)
- [Remove a document in Vector Store](remove-a-document-in-vector-store.md)
### Search Operations
- [Search for Vector Store documents by semantic similarity](search-for-vector-store-documents-by-semantic-similarity.md)
## Creating a Vector Store
To create a new Vector Store, use the following endpoint:
||CODE_BLOCK||
POST /api/v1/vector-stores/
||CODE_BLOCK||
### Request Body
||CODE_BLOCK||json
{
"name": "Store name",
"description": "Store description",
"embedding_model": "sentence-transformers/all-mpnet-base-v2",
"indexing_algorithm": "ivfflat",
"vector_dimension": 128,
"search_metric": "cosine_distance"
}
||CODE_BLOCK||
### Response
||CODE_BLOCK||json
{
"id": "store_id",
"name": "Vector Store Name",
"description": "Store description",
"created": "2023-05-01T00:00:00Z",
"modified": "2023-05-01T00:00:00Z"
}
||CODE_BLOCK||
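A minimal request sketch using the body shown above. The `Authorization` header follows the Authentication note below (management operations use a JWT); `YOUR_JWT_TOKEN` is a placeholder.
||CODE_BLOCK||python
import requests

# Sketch of creating a vector store with the documented request body.
response = requests.post(
    "https://api.rememberizer.ai/api/v1/vector-stores/",
    headers={
        "Authorization": "Bearer YOUR_JWT_TOKEN",
        "Content-Type": "application/json",
    },
    json={
        "name": "Store name",
        "description": "Store description",
        "embedding_model": "sentence-transformers/all-mpnet-base-v2",
        "indexing_algorithm": "ivfflat",
        "vector_dimension": 128,
        "search_metric": "cosine_distance",
    },
)
print(response.json())
||CODE_BLOCK||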
## Vector Store Configurations
To retrieve available configurations for vector stores, use:
||CODE_BLOCK||
GET /api/v1/vector-stores/configs
||CODE_BLOCK||
This will return available embedding models, indexing algorithms, and search metrics that can be used when creating or configuring vector stores.
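A short sketch of fetching these configurations, again assuming JWT authentication per the note below; `YOUR_JWT_TOKEN` is a placeholder.
||CODE_BLOCK||python
import requests

# Sketch: list available embedding models, indexing algorithms, and search metrics.
response = requests.get(
    "https://api.rememberizer.ai/api/v1/vector-stores/configs",
    headers={"Authorization": "Bearer YOUR_JWT_TOKEN"},
)
print(response.json())
||CODE_BLOCK||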
## Authentication
All Vector Store endpoints require authentication using either:
- JWT token for management operations
- API key for document and search operations
==> developer/api-docs/vector-store/get-vector-stores-information.md <==
# Get vector store's information
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/me" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/vector-stores/me \
-H "x-api-key: YOUR_API_KEY"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getVectorStoreInfo = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/vector-stores/me', {
method: 'GET',
headers: {
'x-api-key': 'YOUR_API_KEY'
}
});
const data = await response.json();
console.log(data);
};
getVectorStoreInfo();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_vector_store_info():
headers = {
"x-api-key": "YOUR_API_KEY"
}
response = requests.get(
"https://api.rememberizer.ai/api/v1/vector-stores/me",
headers=headers
)
data = response.json()
print(data)
get_vector_store_info()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key.
{% endhint %}
{% endtab %}
{% endtabs %}
## Response Format
||CODE_BLOCK||json
{
"id": "vs_abc123",
"name": "My Vector Store",
"description": "A vector store for product documentation",
"embedding_model": "sentence-transformers/all-mpnet-base-v2",
"indexing_algorithm": "ivfflat",
"vector_dimension": 128,
"search_metric": "cosine_distance",
"created": "2023-06-01T10:30:00Z",
"modified": "2023-06-15T14:45:00Z"
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store not found |
| 500 | Internal Server Error |
This endpoint retrieves information about the vector store associated with the provided API key. It's useful for checking configuration details, including the embedding model, dimensionality, and search metric being used. This information can be valuable for optimizing search queries and understanding the vector store's capabilities.
==> developer/api-docs/vector-store/search-for-vector-store-documents-by-semantic-similarity.md <==
---
description: Search Vector Store documents with semantic similarity and batch operations
type: api
last_updated: 2025-04-03
---
# Search for Vector Store documents by semantic similarity
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/search" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
"https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/search?q=How%20to%20integrate%20our%20product%20with%20third-party%20systems&n=5&prev_chunks=1&next_chunks=1" \
-H "x-api-key: YOUR_API_KEY"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const searchVectorStore = async (vectorStoreId, query, numResults = 5, prevChunks = 1, nextChunks = 1) => {
const url = new URL(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/search`);
url.searchParams.append('q', query);
url.searchParams.append('n', numResults);
url.searchParams.append('prev_chunks', prevChunks);
url.searchParams.append('next_chunks', nextChunks);
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'x-api-key': 'YOUR_API_KEY'
}
});
const data = await response.json();
console.log(data);
};
searchVectorStore(
'vs_abc123',
'How to integrate our product with third-party systems',
5,
1,
1
);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def search_vector_store(vector_store_id, query, num_results=5, prev_chunks=1, next_chunks=1):
headers = {
"x-api-key": "YOUR_API_KEY"
}
params = {
"q": query,
"n": num_results,
"prev_chunks": prev_chunks,
"next_chunks": next_chunks
}
response = requests.get(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/search",
headers=headers,
params=params
)
data = response.json()
print(data)
search_vector_store(
'vs_abc123',
'How to integrate our product with third-party systems',
5,
1,
1
)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
def search_vector_store(vector_store_id, query, num_results=5, prev_chunks=1, next_chunks=1)
uri = URI("https://api.rememberizer.ai/api/v1/vector-stores/#{vector_store_id}/documents/search")
params = {
q: query,
n: num_results,
prev_chunks: prev_chunks,
next_chunks: next_chunks
}
uri.query = URI.encode_www_form(params)
request = Net::HTTP::Get.new(uri)
request['x-api-key'] = 'YOUR_API_KEY'
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
response = http.request(request)
data = JSON.parse(response.body)
puts data
end
search_vector_store(
'vs_abc123',
'How to integrate our product with third-party systems',
5,
1,
1
)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store to search in. |
## Query Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| q | string | **Required.** The search query text. |
| n | integer | Number of results to return. Default: 10. |
| t | number | Matching threshold. Default: 0.7. |
| prev_chunks | integer | Number of chunks before the matched chunk to include. Default: 0. |
| next_chunks | integer | Number of chunks after the matched chunk to include. Default: 0. |
## Response Format
||CODE_BLOCK||json
{
"vector_store": {
"id": "vs_abc123",
"name": "Product Documentation"
},
"matched_chunks": [
{
"document": {
"id": 1234,
"name": "Integration Guide.pdf",
"type": "application/pdf",
"size": 250000,
"indexed_on": "2023-06-15T10:30:00Z",
"vector_store": "vs_abc123",
"created": "2023-06-15T10:15:00Z",
"modified": "2023-06-15T10:30:00Z"
},
"matched_content": "Our product offers several integration options for third-party systems. The primary method is through our RESTful API, which supports OAuth2 authentication. Additionally, you can use our SDK available in Python, JavaScript, and Java.",
"distance": 0.123
},
// ... more matched chunks
]
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 400 | Bad Request - Missing required parameters or invalid format |
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store not found |
| 500 | Internal Server Error |
## Search Optimization Tips
### Context Windows
Use the `prev_chunks` and `next_chunks` parameters to control how much context is included with each match:
- Set both to 0 for precise matches without context
- Set both to 1-2 for matches with minimal context
- Set both to 3-5 for matches with substantial context
### Matching Threshold
The `t` parameter controls how strictly matches are filtered (see the comparison sketch after this list):
- Higher values (e.g., 0.9) return only very close matches
- Lower values (e.g., 0.5) return more matches with greater variety
- The default (0.7) provides a balanced approach
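The sketch below runs the same query twice with a strict and a loose threshold so you can compare result counts. `YOUR_API_KEY` and `vs_abc123` are placeholders.
||CODE_BLOCK||python
import requests

# Per the notes above, a higher `t` returns only very close matches,
# while a lower `t` returns more varied matches.
def search_with_threshold(query, t):
    response = requests.get(
        "https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/search",
        headers={"x-api-key": "YOUR_API_KEY"},
        params={"q": query, "n": 10, "t": t, "prev_chunks": 1, "next_chunks": 1},
    )
    return response.json().get("matched_chunks", [])

strict = search_with_threshold("third-party integration options", 0.9)
loose = search_with_threshold("third-party integration options", 0.5)
print(len(strict), "strict matches vs", len(loose), "loose matches")
||CODE_BLOCK||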
## Batch Operations
For high-throughput applications, Rememberizer supports efficient batch operations on vector stores. These methods optimize performance when processing multiple search queries.
### Batch Search Implementation
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import time
import concurrent.futures
def batch_search_vector_store(vector_store_id, queries, num_results=5, batch_size=10):
"""
Perform batch searches against a vector store
Args:
vector_store_id: ID of the vector store to search
queries: List of search query strings
num_results: Number of results per query
batch_size: Number of parallel requests
Returns:
List of search results
"""
headers = {
"x-api-key": "YOUR_API_KEY"
}
results = []
# Process in batches to avoid overwhelming the API
for i in range(0, len(queries), batch_size):
batch_queries = queries[i:i+batch_size]
with concurrent.futures.ThreadPoolExecutor(max_workers=batch_size) as executor:
futures = []
for query in batch_queries:
params = {
"q": query,
"n": num_results,
"prev_chunks": 1,
"next_chunks": 1
}
# Submit the request to the thread pool
future = executor.submit(
requests.get,
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/search",
headers=headers,
params=params
)
futures.append(future)
# Collect results from all futures
for future in futures:
response = future.result()
if response.status_code == 200:
results.append(response.json())
else:
results.append({"error": f"Failed with status code: {response.status_code}"})
# Add a delay between batches to avoid rate limiting
if i + batch_size < len(queries):
time.sleep(1)
return results
# Example usage
queries = [
"Integration with REST APIs",
"Authentication protocols",
"How to deploy to production",
"Performance optimization techniques",
"Error handling best practices"
]
search_results = batch_search_vector_store("vs_abc123", queries, num_results=3, batch_size=5)
||CODE_BLOCK||
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
/**
* Perform batch searches against a vector store
*
* @param {string} vectorStoreId - ID of the vector store
* @param {string[]} queries - List of search queries
* @param {Object} options - Configuration options
* @returns {Promise} - List of search results
*/
async function batchSearchVectorStore(vectorStoreId, queries, options = {}) {
const {
numResults = 5,
batchSize = 10,
delayBetweenBatches = 1000,
prevChunks = 1,
nextChunks = 1,
distanceThreshold = 0.7
} = options;
const results = [];
const apiKey = 'YOUR_API_KEY';
// Process in batches to manage API load
for (let i = 0; i < queries.length; i += batchSize) {
const batchQueries = queries.slice(i, i + batchSize);
// Create promise array for parallel requests
const batchPromises = batchQueries.map(query => {
const url = new URL(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/search`);
url.searchParams.append('q', query);
url.searchParams.append('n', numResults);
url.searchParams.append('prev_chunks', prevChunks);
url.searchParams.append('next_chunks', nextChunks);
url.searchParams.append('t', distanceThreshold);
return fetch(url.toString(), {
method: 'GET',
headers: {
'x-api-key': apiKey
}
})
.then(response => {
if (response.ok) {
return response.json();
} else {
return { error: `Failed with status: ${response.status}` };
}
})
.catch(error => {
return { error: error.message };
});
});
// Wait for all requests in batch to complete
const batchResults = await Promise.all(batchPromises);
results.push(...batchResults);
// Add delay between batches to avoid rate limiting
if (i + batchSize < queries.length) {
await new Promise(resolve => setTimeout(resolve, delayBetweenBatches));
}
}
return results;
}
// Example usage
const queries = [
"Integration with REST APIs",
"Authentication protocols",
"How to deploy to production",
"Performance optimization techniques",
"Error handling best practices"
];
const options = {
numResults: 3,
batchSize: 5,
delayBetweenBatches: 1000,
prevChunks: 1,
nextChunks: 1
};
batchSearchVectorStore("vs_abc123", queries, options)
.then(results => console.log(results))
.catch(error => console.error("Batch search failed:", error));
||CODE_BLOCK||
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
require 'concurrent'
# Perform batch searches against a vector store
#
# @param vector_store_id [String] ID of the vector store
# @param queries [Array] List of search queries
# @param num_results [Integer] Number of results per query
# @param batch_size [Integer] Number of parallel requests
# @param delay_between_batches [Float] Seconds to wait between batches
# @return [Array] Search results for each query
def batch_search_vector_store(vector_store_id, queries, num_results: 5, batch_size: 10, delay_between_batches: 1.0)
results = []
api_key = 'YOUR_API_KEY'
# Process in batches
queries.each_slice(batch_size).with_index do |batch_queries, batch_index|
# Create a thread pool for concurrent execution
pool = Concurrent::FixedThreadPool.new(batch_size)
futures = []
batch_queries.each do |query|
# Submit each request to thread pool
futures << Concurrent::Future.execute(executor: pool) do
uri = URI("https://api.rememberizer.ai/api/v1/vector-stores/#{vector_store_id}/documents/search")
params = {
q: query,
n: num_results,
prev_chunks: 1,
next_chunks: 1
}
uri.query = URI.encode_www_form(params)
request = Net::HTTP::Get.new(uri)
request['x-api-key'] = api_key
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
begin
response = http.request(request)
if response.code.to_i == 200
JSON.parse(response.body)
else
{ "error" => "Failed with status code: #{response.code}" }
end
rescue => e
{ "error" => e.message }
end
end
end
# Collect results from all futures
batch_results = futures.map(&:value)
results.concat(batch_results)
# Add delay between batches
if batch_index < (queries.length / batch_size.to_f).ceil - 1
sleep(delay_between_batches)
    end
    # Shut down this batch's thread pool; it is scoped to this block
    pool.shutdown
  end
  results
end
# Example usage
queries = [
"Integration with REST APIs",
"Authentication protocols",
"How to deploy to production",
"Performance optimization techniques",
"Error handling best practices"
]
results = batch_search_vector_store(
"vs_abc123",
queries,
num_results: 3,
batch_size: 5
)
puts results
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Performance Optimization for Batch Operations
When implementing batch operations for vector store searches, consider these best practices:
1. **Optimal Batch Sizing**: For most applications, processing 5-10 queries in parallel provides a good balance between throughput and resource usage.
2. **Rate Limiting Awareness**: Include delay mechanisms between batches (typically 1-2 seconds) to avoid hitting API rate limits.
3. **Error Handling**: Implement robust error handling for individual queries that may fail within a batch.
4. **Connection Management**: For high-volume applications, implement connection pooling to reduce overhead.
5. **Timeout Configuration**: Set appropriate timeouts for each request to prevent long-running queries from blocking the entire batch.
6. **Result Processing**: Consider processing results asynchronously as they become available rather than waiting for all results.
7. **Monitoring**: Track performance metrics like average response time and success rates to identify optimization opportunities.
For production applications with very high query volumes, consider implementing a queue system with worker processes to manage large batches efficiently.
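As one way to apply tips 5 and 6 above, the sketch below processes each result as soon as its request finishes instead of waiting for the whole batch. `YOUR_API_KEY` and `vs_abc123` are placeholders, and the timeout value is an assumption.
||CODE_BLOCK||python
import concurrent.futures
import requests

# Sketch: handle results as they become available using as_completed.
def stream_search_results(queries, num_results=3, max_workers=5):
    def search(query):
        response = requests.get(
            "https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/search",
            headers={"x-api-key": "YOUR_API_KEY"},
            params={"q": query, "n": num_results},
            timeout=30,  # per-request timeout (tip 5 above)
        )
        return query, response.json()

    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(search, q) for q in queries]
        for future in concurrent.futures.as_completed(futures):
            query, result = future.result()
            # Process each result as soon as it arrives (tip 6 above)
            print(query, "->", len(result.get("matched_chunks", [])), "matches")
||CODE_BLOCK||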
This endpoint allows you to search your vector store using semantic similarity. It returns documents that are conceptually related to your query, even if they don't contain the exact keywords. This makes it particularly powerful for natural language queries and question answering.
==> developer/api-docs/vector-store/add-new-text-document-to-a-vector-store.md <==
# Add new text document to a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/create" method="post" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/create \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Product Overview",
"text": "Our product is an innovative solution for managing vector embeddings. It provides seamless integration with your existing systems and offers powerful semantic search capabilities."
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const addTextDocument = async (vectorStoreId, name, text) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/create`, {
method: 'POST',
headers: {
'x-api-key': 'YOUR_API_KEY',
'Content-Type': 'application/json'
},
body: JSON.stringify({
name: name,
text: text
})
});
const data = await response.json();
console.log(data);
};
addTextDocument(
'vs_abc123',
'Product Overview',
'Our product is an innovative solution for managing vector embeddings. It provides seamless integration with your existing systems and offers powerful semantic search capabilities.'
);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def add_text_document(vector_store_id, name, text):
headers = {
"x-api-key": "YOUR_API_KEY",
"Content-Type": "application/json"
}
payload = {
"name": name,
"text": text
}
response = requests.post(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/create",
headers=headers,
data=json.dumps(payload)
)
data = response.json()
print(data)
add_text_document(
'vs_abc123',
'Product Overview',
'Our product is an innovative solution for managing vector embeddings. It provides seamless integration with your existing systems and offers powerful semantic search capabilities.'
)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store to add the document to. |
## Request Body
||CODE_BLOCK||json
{
"name": "Product Overview",
"text": "Our product is an innovative solution for managing vector embeddings. It provides seamless integration with your existing systems and offers powerful semantic search capabilities."
}
||CODE_BLOCK||
| Parameter | Type | Description |
|-----------|------|-------------|
| name | string | **Required.** The name of the document. |
| text | string | **Required.** The text content of the document. |
## Response Format
||CODE_BLOCK||json
{
"id": 1234,
"name": "Product Overview",
"type": "text/plain",
"vector_store": "vs_abc123",
"size": 173,
"status": "processing",
"processing_status": "queued",
"indexed_on": null,
"status_error_message": null,
"created": "2023-06-15T10:15:00Z",
"modified": "2023-06-15T10:15:00Z"
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 400 | Bad Request - Missing required fields or invalid format |
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store not found |
| 500 | Internal Server Error |
This endpoint allows you to add text content directly to your vector store. It's particularly useful for storing information that might not exist in file format, such as product descriptions, knowledge base articles, or custom content. The text will be automatically processed into vector embeddings, making it searchable using semantic similarity.
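Because newly added text starts in the `processing` state, a common pattern is to poll the document-info endpoint documented earlier until processing completes. The sketch below does that; the polling interval and retry cap are assumptions, and `YOUR_API_KEY` and `vs_abc123` are placeholders.
||CODE_BLOCK||python
import time
import requests

API_KEY = "YOUR_API_KEY"
STORE_ID = "vs_abc123"

# 1. Add the text document
created = requests.post(
    f"https://api.rememberizer.ai/api/v1/vector-stores/{STORE_ID}/documents/create",
    headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
    json={"name": "Product Overview", "text": "Our product is an innovative solution..."},
).json()

# 2. Poll its status until it is no longer processing (assumed ~60 s cap)
doc_id = created["id"]
for _ in range(30):
    info = requests.get(
        f"https://api.rememberizer.ai/api/v1/vector-stores/{STORE_ID}/documents/{doc_id}",
        headers={"x-api-key": API_KEY},
    ).json()
    if info.get("status") != "processing":
        break
    time.sleep(2)
print(info)
||CODE_BLOCK||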
==> developer/api-docs/vector-store/remove-a-document-in-vector-store.md <==
# Remove a document in Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}/" method="delete" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X DELETE \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/1234/ \
-H "x-api-key: YOUR_API_KEY"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const deleteDocument = async (vectorStoreId, documentId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/${documentId}/`, {
method: 'DELETE',
headers: {
'x-api-key': 'YOUR_API_KEY'
}
});
if (response.status === 204) {
console.log("Document deleted successfully");
} else {
console.error("Failed to delete document");
}
};
deleteDocument('vs_abc123', 1234);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def delete_document(vector_store_id, document_id):
headers = {
"x-api-key": "YOUR_API_KEY"
}
response = requests.delete(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/{document_id}/",
headers=headers
)
if response.status_code == 204:
print("Document deleted successfully")
else:
print(f"Failed to delete document: {response.text}")
delete_document('vs_abc123', 1234)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store containing the document. |
| document-id | integer | **Required.** The ID of the document to delete. |
## Response
A successful request returns a 204 No Content status code with no response body.
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store or document not found |
| 500 | Internal Server Error |
This endpoint allows you to remove a document from your vector store. Once deleted, the document and its vector embeddings will no longer be available for search operations. This is useful for removing outdated, irrelevant, or sensitive content from your knowledge base.
{% hint style="warning" %}
Warning: Document deletion is permanent and cannot be undone. Make sure you have a backup of important documents before deleting them.
{% endhint %}
==> developer/api-docs/vector-store/update-files-content-in-a-vector-store.md <==
# Update file's content in a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}/" method="patch" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X PATCH \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/1234/ \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Updated Product Overview"
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const updateDocument = async (vectorStoreId, documentId, newName) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/${documentId}/`, {
method: 'PATCH',
headers: {
'x-api-key': 'YOUR_API_KEY',
'Content-Type': 'application/json'
},
body: JSON.stringify({
name: newName
})
});
const data = await response.json();
console.log(data);
};
updateDocument('vs_abc123', 1234, 'Updated Product Overview');
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def update_document(vector_store_id, document_id, new_name):
headers = {
"x-api-key": "YOUR_API_KEY",
"Content-Type": "application/json"
}
payload = {
"name": new_name
}
response = requests.patch(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/{document_id}/",
headers=headers,
data=json.dumps(payload)
)
data = response.json()
print(data)
update_document('vs_abc123', 1234, 'Updated Product Overview')
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store containing the document. |
| document-id | integer | **Required.** The ID of the document to update. |
## Request Body
||CODE_BLOCK||json
{
"name": "Updated Product Overview"
}
||CODE_BLOCK||
| Parameter | Type | Description |
|-----------|------|-------------|
| name | string | The new name for the document. |
## Response Format
||CODE_BLOCK||json
{
"id": 1234,
"name": "Updated Product Overview",
"type": "text/plain",
"vector_store": "vs_abc123",
"size": 173,
"status": "indexed",
"processing_status": "completed",
"indexed_on": "2023-06-15T10:30:00Z",
"status_error_message": null,
"created": "2023-06-15T10:15:00Z",
"modified": "2023-06-15T11:45:00Z"
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 400 | Bad Request - Invalid request format |
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store or document not found |
| 500 | Internal Server Error |
This endpoint allows you to update the metadata of a document in your vector store. Currently, you can only update the document's name. This is useful for improving document organization and discoverability without needing to re-upload the document.
{% hint style="info" %}
Note: This endpoint only updates the document's metadata, not its content. To update the content, you need to delete the existing document and upload a new one.
{% endhint %}
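Since this PATCH endpoint only renames a document, replacing its content means combining the delete endpoint with the upload endpoint described in [Upload files to a Vector Store](upload-files-to-a-vector-store.md). Below is a minimal sketch of that workflow; the `replace_document_content` helper is an illustrative name, and note that the replacement file receives a new document ID.
||CODE_BLOCK||python
import requests

API_BASE = "https://api.rememberizer.ai/api/v1"

def replace_document_content(vector_store_id, document_id, new_file_path, api_key):
    """Delete the outdated document, then upload the replacement file."""
    headers = {"x-api-key": api_key}
    # 1. Remove the existing document (204 No Content on success)
    delete_resp = requests.delete(
        f"{API_BASE}/vector-stores/{vector_store_id}/documents/{document_id}/",
        headers=headers,
    )
    delete_resp.raise_for_status()
    # 2. Upload the new file to the same vector store
    with open(new_file_path, "rb") as fh:
        upload_resp = requests.post(
            f"{API_BASE}/vector-stores/{vector_store_id}/documents/upload",
            headers=headers,
            files=[("files", (new_file_path.split("/")[-1], fh))],
        )
    upload_resp.raise_for_status()
    return upload_resp.json()

# Example usage (IDs and path are placeholders)
# result = replace_document_content('vs_abc123', 1234, '/path/to/updated-overview.pdf', 'YOUR_API_KEY')
||CODE_BLOCK||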
==> developer/api-docs/vector-store/upload-files-to-a-vector-store.md <==
---
description: Upload file content to Vector Store with batch operations
type: api
last_updated: 2025-04-03
---
# Upload files to a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/upload" method="post" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/upload \
-H "x-api-key: YOUR_API_KEY" \
-F "files=@/path/to/document1.pdf" \
-F "files=@/path/to/document2.docx"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and provide the paths to your local files.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const uploadFiles = async (vectorStoreId, files) => {
const formData = new FormData();
// Add multiple files to the form data
for (const file of files) {
formData.append('files', file);
}
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/upload`, {
method: 'POST',
headers: {
'x-api-key': 'YOUR_API_KEY'
// Note: Do not set Content-Type header, it will be set automatically with the correct boundary
},
body: formData
});
const data = await response.json();
console.log(data);
};
// Example usage with file input element
const fileInput = document.getElementById('fileInput');
uploadFiles('vs_abc123', fileInput.files);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def upload_files(vector_store_id, file_paths):
headers = {
"x-api-key": "YOUR_API_KEY"
}
files = [('files', (file_path.split('/')[-1], open(file_path, 'rb'))) for file_path in file_paths]
response = requests.post(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/upload",
headers=headers,
files=files
)
data = response.json()
print(data)
upload_files('vs_abc123', ['/path/to/document1.pdf', '/path/to/document2.docx'])
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and provide the paths to your local files.
{% endhint %}
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
def upload_files(vector_store_id, file_paths)
uri = URI("https://api.rememberizer.ai/api/v1/vector-stores/#{vector_store_id}/documents/upload")
# Create a new HTTP object
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
# Create a multipart-form request
request = Net::HTTP::Post.new(uri)
request['x-api-key'] = 'YOUR_API_KEY'
# Create a multipart boundary
boundary = "RubyFormBoundary#{rand(1000000)}"
request['Content-Type'] = "multipart/form-data; boundary=#{boundary}"
# Build the request body
body = []
file_paths.each do |file_path|
file_name = File.basename(file_path)
file_content = File.read(file_path, mode: 'rb')
body << "--#{boundary}\r\n"
body << "Content-Disposition: form-data; name=\"files\"; filename=\"#{file_name}\"\r\n"
body << "Content-Type: #{get_content_type(file_name)}\r\n\r\n"
body << file_content
body << "\r\n"
end
body << "--#{boundary}--\r\n"
request.body = body.join
# Send the request
response = http.request(request)
# Parse and return the response
JSON.parse(response.body)
end
# Helper method to determine content type
def get_content_type(filename)
ext = File.extname(filename).downcase
case ext
when '.pdf' then 'application/pdf'
when '.doc' then 'application/msword'
when '.docx' then 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
when '.txt' then 'text/plain'
when '.md' then 'text/markdown'
when '.json' then 'application/json'
else 'application/octet-stream'
end
end
# Example usage
result = upload_files('vs_abc123', ['/path/to/document1.pdf', '/path/to/document2.docx'])
puts result
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and provide the paths to your local files.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store to upload files to. |
## Request Body
This endpoint accepts a `multipart/form-data` request with one or more files in the `files` field.
## Response Format
||CODE_BLOCK||json
{
"documents": [
{
"id": 1234,
"name": "document1.pdf",
"type": "application/pdf",
"size": 250000,
"status": "processing",
"created": "2023-06-15T10:15:00Z",
"vector_store": "vs_abc123"
},
{
"id": 1235,
"name": "document2.docx",
"type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"size": 180000,
"status": "processing",
"created": "2023-06-15T10:15:00Z",
"vector_store": "vs_abc123"
}
],
"errors": []
}
||CODE_BLOCK||
If some files fail to upload, they will be listed in the `errors` array:
||CODE_BLOCK||json
{
"documents": [
{
"id": 1234,
"name": "document1.pdf",
"type": "application/pdf",
"size": 250000,
"status": "processing",
"created": "2023-06-15T10:15:00Z",
"vector_store": "vs_abc123"
}
],
"errors": [
{
"file": "document2.docx",
"error": "File format not supported"
}
]
}
||CODE_BLOCK||
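Because a single upload request can succeed for some files and fail for others, it helps to split the response into the two documented arrays before deciding what to retry. A minimal sketch, assuming only the response shape shown above:
||CODE_BLOCK||python
def summarize_upload(result):
    """Split an upload response into succeeded documents and failed files."""
    succeeded = {doc["id"]: doc["name"] for doc in result.get("documents", [])}
    failed = {err["file"]: err["error"] for err in result.get("errors", [])}
    for doc_id, name in succeeded.items():
        print(f"Uploaded {name} (document id {doc_id})")
    for file_name, reason in failed.items():
        print(f"Failed to upload {file_name}: {reason}")
    return succeeded, failed

# Example usage with the parsed JSON from the upload request
# succeeded, failed = summarize_upload(response.json())
||CODE_BLOCK||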
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Supported File Formats
- PDF (`.pdf`)
- Microsoft Word (`.doc`, `.docx`)
- Microsoft Excel (`.xls`, `.xlsx`)
- Microsoft PowerPoint (`.ppt`, `.pptx`)
- Text files (`.txt`)
- Markdown (`.md`)
- JSON (`.json`)
- HTML (`.html`, `.htm`)
## File Size Limits
- Individual file size limit: 50MB
- Total request size limit: 100MB
- Maximum number of files per request: 20
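Validating files locally against these documented limits before calling the API avoids predictable 413 and 415 errors. The sketch below hard-codes the formats and limits listed above; the `validate_upload_batch` helper is illustrative and should be adjusted if these limits change.
||CODE_BLOCK||python
import os

SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx",
                        ".txt", ".md", ".json", ".html", ".htm"}
MAX_FILE_BYTES = 50 * 1024 * 1024      # 50MB per file
MAX_REQUEST_BYTES = 100 * 1024 * 1024  # 100MB per request
MAX_FILES_PER_REQUEST = 20

def validate_upload_batch(file_paths):
    """Return (valid_paths, problems) for a single upload request."""
    problems = []
    valid = []
    total_bytes = 0
    for path in file_paths[:MAX_FILES_PER_REQUEST]:
        ext = os.path.splitext(path)[1].lower()
        size = os.path.getsize(path)
        if ext not in SUPPORTED_EXTENSIONS:
            problems.append(f"{path}: unsupported format {ext}")
        elif size > MAX_FILE_BYTES:
            problems.append(f"{path}: {size} bytes exceeds the 50MB per-file limit")
        elif total_bytes + size > MAX_REQUEST_BYTES:
            problems.append(f"{path}: would push the request over the 100MB limit")
        else:
            valid.append(path)
            total_bytes += size
    if len(file_paths) > MAX_FILES_PER_REQUEST:
        skipped = len(file_paths) - MAX_FILES_PER_REQUEST
        problems.append(f"{skipped} files over the 20-file limit were not checked")
    return valid, problems
||CODE_BLOCK||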
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 207 | Multi-Status - Some files were uploaded successfully, but others failed |
| 400 | Bad Request - No files provided or invalid request format |
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store not found |
| 413 | Payload Too Large - Files exceed size limit |
| 415 | Unsupported Media Type - File format not supported |
| 500 | Internal Server Error |
## Processing Status
Files are initially accepted with a status of `processing`. You can check the processing status of the documents using the [Get a List of Documents in a Vector Store](get-a-list-of-documents-in-a-vector-store.md) endpoint. The status will be one of:
- `processing`: Document is still being processed
- `done`: Document was successfully processed
- `error`: An error occurred during processing
Processing time depends on file size and complexity; typical processing takes between 30 seconds and 5 minutes per document.
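One way to wait for indexing to complete is to poll the documents list until every uploaded ID leaves the `processing` state. This is a rough sketch only: it assumes the endpoint on the [Get a List of Documents in a Vector Store](get-a-list-of-documents-in-a-vector-store.md) page is `GET /vector-stores/{vector-store-id}/documents/` and that it returns the `id` and `status` fields shown above, so check that page for the exact path and response shape.
||CODE_BLOCK||python
import time
import requests

def wait_for_processing(vector_store_id, document_ids, api_key, poll_interval=15, timeout=600):
    """Poll until every document leaves the `processing` state or the timeout expires."""
    deadline = time.time() + timeout
    pending = set(document_ids)
    while pending and time.time() < deadline:
        # Assumed path of the list-documents endpoint; see the linked page for the exact path.
        response = requests.get(
            f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/",
            headers={"x-api-key": api_key},
        )
        response.raise_for_status()
        payload = response.json()
        # The exact response shape is documented on the linked page; handle either a bare
        # list or a {"documents": [...]} wrapper.
        documents = payload.get("documents", []) if isinstance(payload, dict) else payload
        for doc in documents:
            if doc.get("id") in pending and doc.get("status") != "processing":
                print(f"Document {doc['id']} finished with status {doc.get('status')}")
                pending.discard(doc["id"])
        if pending:
            time.sleep(poll_interval)
    return pending  # any IDs still pending when the timeout expired

# Example usage with the document IDs returned by the upload call
# still_processing = wait_for_processing('vs_abc123', [1234, 1235], 'YOUR_API_KEY')
||CODE_BLOCK||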
## Batch Operations
To upload multiple files to your Vector Store efficiently, Rememberizer supports batch operations. This approach helps optimize performance when dealing with large numbers of documents.
### Batch Upload Implementation
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import os
import requests
import time
from pathlib import Path
def batch_upload_to_vector_store(vector_store_id, folder_path, batch_size=5, file_types=None):
"""
Upload all files from a directory to a Vector Store in batches
Args:
vector_store_id: ID of the vector store
folder_path: Path to folder containing files to upload
batch_size: Number of files to upload in each batch
file_types: Optional list of file extensions to filter by (e.g., ['.pdf', '.docx'])
Returns:
List of upload results
"""
api_key = "YOUR_API_KEY"
headers = {"x-api-key": api_key}
# Get list of files in directory
files = []
for entry in os.scandir(folder_path):
if entry.is_file():
file_path = Path(entry.path)
# Filter by file extension if specified
if file_types is None or file_path.suffix.lower() in file_types:
files.append(file_path)
print(f"Found {len(files)} files to upload")
results = []
# Process files in batches
for i in range(0, len(files), batch_size):
batch = files[i:i+batch_size]
print(f"Processing batch {i//batch_size + 1}/{(len(files) + batch_size - 1)//batch_size}: {len(batch)} files")
# Upload batch
upload_files = []
for file_path in batch:
upload_files.append(('files', (file_path.name, open(file_path, 'rb'))))
try:
response = requests.post(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/upload",
headers=headers,
files=upload_files
)
# Close all file handles
for _, (_, file_obj) in upload_files:
file_obj.close()
if response.status_code in (200, 201, 207):
batch_result = response.json()
results.append(batch_result)
print(f"Successfully uploaded batch - {len(batch_result.get('documents', []))} documents processed")
# Check for errors
if batch_result.get('errors') and len(batch_result['errors']) > 0:
print(f"Errors encountered: {len(batch_result['errors'])}")
for error in batch_result['errors']:
print(f"- {error['file']}: {error['error']}")
else:
print(f"Batch upload failed with status code {response.status_code}: {response.text}")
results.append({"error": f"Batch failed: {response.text}"})
except Exception as e:
print(f"Exception during batch upload: {str(e)}")
results.append({"error": str(e)})
# Close any remaining file handles in case of exception
for _, (_, file_obj) in upload_files:
try:
file_obj.close()
except:
pass
# Rate limiting - pause between batches
if i + batch_size < len(files):
print("Pausing before next batch...")
time.sleep(2)
return results
# Example usage
results = batch_upload_to_vector_store(
'vs_abc123',
'/path/to/documents/folder',
batch_size=5,
file_types=['.pdf', '.docx', '.txt']
)
||CODE_BLOCK||
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
/**
* Upload files to a Vector Store in batches
*
* @param {string} vectorStoreId - ID of the Vector Store
* @param {FileList|File[]} files - Files to upload
* @param {Object} options - Configuration options
* @returns {Promise} - List of upload results
*/
async function batchUploadToVectorStore(vectorStoreId, files, options = {}) {
const {
batchSize = 5,
delayBetweenBatches = 2000,
onProgress = null
} = options;
const apiKey = 'YOUR_API_KEY';
const results = [];
const fileList = Array.from(files);
const totalBatches = Math.ceil(fileList.length / batchSize);
console.log(`Preparing to upload ${fileList.length} files in ${totalBatches} batches`);
// Process files in batches
for (let i = 0; i < fileList.length; i += batchSize) {
const batch = fileList.slice(i, i + batchSize);
const batchNumber = Math.floor(i / batchSize) + 1;
console.log(`Processing batch ${batchNumber}/${totalBatches}: ${batch.length} files`);
if (onProgress) {
onProgress({
currentBatch: batchNumber,
totalBatches: totalBatches,
filesInBatch: batch.length,
totalFiles: fileList.length,
completedFiles: i
});
}
// Create FormData for this batch
const formData = new FormData();
batch.forEach(file => {
formData.append('files', file);
});
try {
const response = await fetch(
`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/upload`,
{
method: 'POST',
headers: {
'x-api-key': apiKey
},
body: formData
}
);
if (response.ok) {
const batchResult = await response.json();
results.push(batchResult);
console.log(`Successfully uploaded batch - ${batchResult.documents?.length || 0} documents processed`);
// Check for errors
if (batchResult.errors && batchResult.errors.length > 0) {
console.warn(`Errors encountered: ${batchResult.errors.length}`);
batchResult.errors.forEach(error => {
console.warn(`- ${error.file}: ${error.error}`);
});
}
} else {
console.error(`Batch upload failed with status ${response.status}: ${await response.text()}`);
results.push({ error: `Batch failed with status: ${response.status}` });
}
} catch (error) {
console.error(`Exception during batch upload: ${error.message}`);
results.push({ error: error.message });
}
// Add delay between batches to avoid rate limiting
if (i + batchSize < fileList.length) {
console.log(`Pausing for ${delayBetweenBatches}ms before next batch...`);
await new Promise(resolve => setTimeout(resolve, delayBetweenBatches));
}
}
console.log(`Upload complete. Processed ${fileList.length} files.`);
return results;
}
// Example usage with file input element
document.getElementById('upload-button').addEventListener('click', async () => {
const fileInput = document.getElementById('file-input');
const vectorStoreId = 'vs_abc123';
const progressBar = document.getElementById('progress-bar');
try {
const results = await batchUploadToVectorStore(vectorStoreId, fileInput.files, {
batchSize: 5,
onProgress: (progress) => {
// Update progress UI
const percentage = Math.round((progress.completedFiles / progress.totalFiles) * 100);
progressBar.style.width = `${percentage}%`;
progressBar.textContent = `${percentage}% (Batch ${progress.currentBatch}/${progress.totalBatches})`;
}
});
console.log('Complete upload results:', results);
} catch (error) {
console.error('Upload failed:', error);
}
});
||CODE_BLOCK||
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
require 'mime/types'
# Upload files to a Vector Store in batches
#
# @param vector_store_id [String] ID of the Vector Store
# @param folder_path [String] Path to folder containing files to upload
# @param batch_size [Integer] Number of files to upload in each batch
# @param file_types [Array] Optional array of file extensions to filter by
# @param delay_between_batches [Float] Seconds to wait between batches
# @return [Array] List of upload results
def batch_upload_to_vector_store(vector_store_id, folder_path, batch_size: 5, file_types: nil, delay_between_batches: 2.0)
api_key = 'YOUR_API_KEY'
results = []
# Get list of files in directory
files = Dir.entries(folder_path)
.select { |f| File.file?(File.join(folder_path, f)) }
.select { |f| file_types.nil? || file_types.include?(File.extname(f).downcase) }
.map { |f| File.join(folder_path, f) }
puts "Found #{files.count} files to upload"
total_batches = (files.count.to_f / batch_size).ceil
# Process files in batches
files.each_slice(batch_size).with_index do |batch, batch_index|
puts "Processing batch #{batch_index + 1}/#{total_batches}: #{batch.count} files"
# Prepare the HTTP request
uri = URI("https://api.rememberizer.ai/api/v1/vector-stores/#{vector_store_id}/documents/upload")
request = Net::HTTP::Post.new(uri)
request['x-api-key'] = api_key
# Create a multipart form boundary
boundary = "RubyBoundary#{rand(1000000)}"
request['Content-Type'] = "multipart/form-data; boundary=#{boundary}"
# Build the request body
body = []
batch.each do |file_path|
file_name = File.basename(file_path)
mime_type = MIME::Types.type_for(file_path).first&.content_type || 'application/octet-stream'
begin
file_content = File.binread(file_path)
body << "--#{boundary}\r\n"
body << "Content-Disposition: form-data; name=\"files\"; filename=\"#{file_name}\"\r\n"
body << "Content-Type: #{mime_type}\r\n\r\n"
body << file_content
body << "\r\n"
rescue => e
puts "Error reading file #{file_path}: #{e.message}"
end
end
body << "--#{boundary}--\r\n"
request.body = body.join
# Send the request
begin
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
response = http.request(request)
if response.code.to_i == 200 || response.code.to_i == 201 || response.code.to_i == 207
batch_result = JSON.parse(response.body)
results << batch_result
puts "Successfully uploaded batch - #{batch_result['documents']&.count || 0} documents processed"
# Check for errors
if batch_result['errors'] && !batch_result['errors'].empty?
puts "Errors encountered: #{batch_result['errors'].count}"
batch_result['errors'].each do |error|
puts "- #{error['file']}: #{error['error']}"
end
end
else
puts "Batch upload failed with status code #{response.code}: #{response.body}"
results << { "error" => "Batch failed: #{response.body}" }
end
rescue => e
puts "Exception during batch upload: #{e.message}"
results << { "error" => e.message }
end
# Rate limiting - pause between batches
if batch_index < total_batches - 1
puts "Pausing for #{delay_between_batches} seconds before next batch..."
sleep(delay_between_batches)
end
end
puts "Upload complete. Processed #{files.count} files."
results
end
# Example usage
results = batch_upload_to_vector_store(
'vs_abc123',
'/path/to/documents/folder',
batch_size: 5,
file_types: ['.pdf', '.docx', '.txt'],
delay_between_batches: 2.0
)
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Batch Upload Best Practices
To optimize performance and reliability when uploading large volumes of files:
1. **Manage Batch Size**: Keep batch sizes between 5 and 10 files for optimal performance. Too many files in a single request increases the risk of timeouts.
2. **Implement Rate Limiting**: Add delays between batches (2-3 seconds recommended) to avoid hitting API rate limits.
3. **Add Error Retry Logic**: For production systems, implement retry logic for failed uploads with exponential backoff (see the sketch after this list).
4. **Validate File Types**: Pre-filter files to ensure they're supported types before attempting upload.
5. **Monitor Batch Progress**: For user-facing applications, provide progress feedback on batch operations.
6. **Handle Partial Success**: The API may return a 207 status code for partial success. Always check individual document statuses.
7. **Clean Up Resources**: Ensure all file handles are properly closed, especially when errors occur.
8. **Parallelize Wisely**: For very large uploads (thousands of files), consider multiple concurrent batch processes targeting different vector stores, then combine results later if needed.
9. **Implement Checksums**: For critical data, verify file integrity before and after upload with checksums.
10. **Log Comprehensive Results**: Maintain detailed logs of all upload operations for troubleshooting.
By following these best practices, you can efficiently manage large-scale document ingestion into your vector stores.
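As one way to implement the retry advice in point 3 above, the sketch below wraps a single batch POST with exponential backoff. It re-opens the files on every attempt so a retry sends complete content, and it retries network failures plus 429/5xx responses (429 is not in the error table above and is handled defensively); the helper name and retry policy are illustrative, not part of the API.
||CODE_BLOCK||python
import time
import requests

def upload_batch_with_retry(vector_store_id, file_paths, api_key,
                            max_attempts=4, base_delay=2.0):
    """POST one batch of files, retrying transient failures with exponential backoff."""
    url = f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/upload"
    headers = {"x-api-key": api_key}
    for attempt in range(1, max_attempts + 1):
        # Re-open the files on each attempt so a retry sends the full content again
        files = [("files", (path.split("/")[-1], open(path, "rb"))) for path in file_paths]
        try:
            response = requests.post(url, headers=headers, files=files)
        except (requests.ConnectionError, requests.Timeout) as exc:
            last_error = str(exc)  # network-level failure: worth retrying
        else:
            if response.status_code in (200, 201, 207):
                return response.json()
            if response.status_code not in (429, 500, 502, 503, 504):
                response.raise_for_status()  # non-retryable client error: surface it immediately
            last_error = f"retryable status {response.status_code}"
        finally:
            for _, (_, file_obj) in files:
                file_obj.close()
        if attempt == max_attempts:
            raise RuntimeError(f"Batch upload failed after {max_attempts} attempts: {last_error}")
        delay = base_delay * (2 ** (attempt - 1))
        print(f"Attempt {attempt} failed ({last_error}); retrying in {delay:.0f}s")
        time.sleep(delay)

# Example usage for a single batch (IDs, paths, and key are placeholders)
# result = upload_batch_with_retry('vs_abc123', ['/path/to/document1.pdf'], 'YOUR_API_KEY')
||CODE_BLOCK||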
```