Finds records similar to a specific existing record using vector embeddings. Use this endpoint to implement “More like this” functionality, discover related content, and run cross-table similarity searches.
POST https://api.worqhat.com/db/find-similar

What Does This Endpoint Do?

This endpoint takes an existing record as a reference and returns other records that are semantically similar to it. Similarity is computed from vector embeddings, so matches reflect meaning rather than shared keywords: “if you like this, you might also like these.”

When to Use Find Similar Records

You’ll find this endpoint useful when you need to:
  • Implement “More like this” features: Show users similar products, articles, or content
  • Discover related items: Find records that are conceptually related to a specific item
  • Cross-table recommendations: Find similar content across different tables
  • Content discovery: Help users explore related content they might not have found otherwise
  • Data analysis: Identify patterns and relationships in your data
  • Duplicate detection: Find potentially duplicate or very similar records

How It Works

  1. You specify a source record by providing its table and ID
  2. The system retrieves the vector embedding of that record
  3. It searches for records with similar embeddings, either in the same table or in a different one (target_table)
  4. Results are returned with similarity scores showing how closely each record matches the source
  5. The source record is excluded from the results by default; set exclude_self to false to include it
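
The steps above can be pictured with a small in-memory sketch. It is not the service's implementation: cosine similarity is assumed as the metric, the threshold is treated as inclusive, and `cosineSimilarity`, `findSimilarLocal`, and the `{ id, embedding }` record shape are hypothetical names used only for illustration.

```javascript
// Illustrative only: cosine similarity is an ASSUMED metric. The service's
// actual scoring and indexing are not described in this document.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Embeddings must have the same number of dimensions');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// records: hypothetical in-memory rows shaped like { id, embedding }
function findSimilarLocal(records, sourceId, options = {}) {
  const { limit = 10, threshold = 0.7, excludeSelf = true } = options;

  // Steps 1-2: locate the source record and its embedding
  const source = records.find((r) => r.id === sourceId);
  if (!source) throw new Error('Record not found');

  return records
    // Step 5: drop the source record unless the caller wants it
    .filter((r) => !(excludeSelf && r.id === sourceId))
    // Step 3: score every candidate against the source embedding
    .map((r) => ({ id: r.id, similarity: cosineSimilarity(source.embedding, r.embedding) }))
    // Keep scores at or above the threshold (inclusive here by assumption)
    .filter((r) => r.similarity >= threshold)
    // Step 4: best matches first, capped at `limit`
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}

const demoRecords = [
  { id: 'a', embedding: [1, 0, 0] },
  { id: 'b', embedding: [0.9, 0.1, 0] },
  { id: 'c', embedding: [0, 1, 0] },
];

// 'b' points in nearly the same direction as 'a'; 'c' is orthogonal and is filtered out
console.log(findSimilarLocal(demoRecords, 'a', { threshold: 0.5 }));
```

A production system would normally use an approximate nearest-neighbor index instead of scoring every row, but the filtering and ranking logic follows the same outline.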

Code Examples

Example 1: Find Similar Products

This example shows how to find records similar to a specific product.
  • Node.js
import Worqhat from 'worqhat';

// Initialize the client with your API key
const client = new Worqhat({
  apiKey: process.env.WORQHAT_API_KEY, // Always use environment variables for API keys
});

async function findSimilarProducts() {
  try {
    // Call the findSimilar method
    const response = await client.db.findSimilar({
      table: "products",
      record_id: "123",
      limit: 10,
      threshold: 0.7,
      exclude_self: true,
      environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
    });
    
    // Handle the successful response
    console.log(`Found ${response.similar_records.length} similar products`);
    console.log('Source product:', response.source_record);
    console.log('Similar products:', response.similar_records);
    return response;
  } catch (error) {
    // Handle any errors
    console.error('Error finding similar products:', error.message);
  }
}

// Call the function
findSimilarProducts();

Example 2: Cross-Table Similarity Search

This example shows how to find similar records across different tables.
  • Node.js
import Worqhat from 'worqhat';

// Initialize the client with your API key
const client = new Worqhat({
  apiKey: process.env.WORQHAT_API_KEY,
});

async function findCrossTableSimilar() {
  try {
    // Find similar records across different tables
    const response = await client.db.findSimilar({
      table: "products",        // Source table
      record_id: "123",        // Source record ID
      target_table: "articles", // Search in different table
      limit: 15,
      threshold: 0.6,
      exclude_self: true,
      environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
    });
    
    // Handle the successful response
    console.log(`Found ${response.similar_records.length} similar articles`);
    console.log('Source product:', response.source_record);
    response.similar_records.forEach(record => {
      console.log(`Article: ${record.record.title}, Similarity: ${record.similarity}`);
    });
    return response;
  } catch (error) {
    console.error('Error finding cross-table similar records:', error.message);
  }
}

findCrossTableSimilar();

Example 3: Include Source Record

This example shows how to include the source record in the results for comparison.
  • Node.js
import Worqhat from 'worqhat';

// Initialize the client with your API key
const client = new Worqhat({
  apiKey: process.env.WORQHAT_API_KEY,
});

async function findSimilarWithSource() {
  try {
    // Find similar records including the source
    const response = await client.db.findSimilar({
      table: "documents",
      record_id: "456",
      limit: 20,
      threshold: 0.5,
      exclude_self: false,  // Include the source record
      environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
    });
    
    // Handle the successful response
    console.log(`Found ${response.similar_records.length} similar documents (including source)`);
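    // Identify the source by ID rather than by a similarity of exactly 1.0:
    // floating-point scores can fall just short of 1, and exact duplicates can also score 1.
    // Assumes records expose an `id` field, as in the example response.
    const sourceId = String(response.source_record.id);
    response.similar_records.forEach((record, index) => {
      const isSource = String(record.record.id) === sourceId;
      console.log(`${index + 1}. ${record.record.title} (Similarity: ${record.similarity}) ${isSource ? '[SOURCE]' : ''}`);
    });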
    return response;
  } catch (error) {
    console.error('Error finding similar documents:', error.message);
  }
}

findSimilarWithSource();

Request Body Explained

  • table (string, required): Table containing the source record. Example: “products”
  • record_id (string | number, required): ID of the source record to find similar records for. Example: “123”
  • limit (number, optional): Maximum number of similar records to return. Range: 1-100. Default: 10.
  • threshold (number, optional): Minimum similarity score (0-1). Only results above this threshold are returned. Default: 0.7.
  • exclude_self (boolean, optional): Whether to exclude the source record from results. Default: true.
  • target_table (string, optional): Table to search for similar records. If omitted, the search runs in the same table as the source. Example: “articles”
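
Because several parameters carry defaults and ranges, it can help to normalize a request before sending it. The sketch below is a hypothetical helper (`buildFindSimilarRequest` is not part of the SDK) that applies the defaults listed above and rejects out-of-range values. Requiring `limit` to be an integer is an assumption; the parameter is documented only as a number.

```javascript
// Hypothetical helper: applies the documented defaults and validates ranges
// before the body is passed to the API. Not part of the worqhat SDK.
function buildFindSimilarRequest(params) {
  const { table, record_id, target_table } = params;
  const limit = params.limit ?? 10;              // documented default
  const threshold = params.threshold ?? 0.7;     // documented default
  const exclude_self = params.exclude_self ?? true; // documented default

  if (typeof table !== 'string' || table.length === 0) {
    throw new TypeError('table is required and must be a non-empty string');
  }
  if (typeof record_id !== 'string' && typeof record_id !== 'number') {
    throw new TypeError('record_id is required and must be a string or number');
  }
  // Assumption: limit is a whole number; the docs give only the range 1-100
  if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
    throw new RangeError('limit must be an integer from 1 to 100');
  }
  if (typeof threshold !== 'number' || Number.isNaN(threshold) || threshold < 0 || threshold > 1) {
    throw new RangeError('threshold must be a number from 0 to 1');
  }
  if (typeof exclude_self !== 'boolean') {
    throw new TypeError('exclude_self must be a boolean');
  }
  if (target_table !== undefined && typeof target_table !== 'string') {
    throw new TypeError('target_table must be a string when provided');
  }

  const body = { table, record_id, limit, threshold, exclude_self };
  if (target_table !== undefined) body.target_table = target_table; // omit to search the source table
  return body;
}

console.log(buildFindSimilarRequest({ table: 'products', record_id: '123', target_table: 'articles' }));
```

The returned object has the same shape as the argument passed to `client.db.findSimilar` in the examples above, so it could be handed to that call directly.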

Response Fields Explained

  • success (boolean): true if the search was successful, false otherwise.
  • source_record (object): The original record used for the similarity search.
  • similar_records (array): Array of similar records, each containing:
      • table: The table where the similar record was found
      • record: The actual record data
      • similarity: Similarity score (0-1) indicating how similar the record is to the source
  • executionTime (number): Search execution time in milliseconds.

Example Response

{
  "success": true,
  "source_record": {
    "id": "123",
    "name": "AI-Powered Smart Assistant",
    "description": "Advanced machine learning device for home automation",
    "category": "electronics",
    "price": 299.99
  },
  "similar_records": [
    {
      "table": "products",
      "record": {
        "id": "456",
        "name": "Smart Home Hub",
        "description": "Central control unit for automated home systems",
        "category": "electronics",
        "price": 199.99
      },
      "similarity": 0.891
    },
    {
      "table": "products",
      "record": {
        "id": "789",
        "name": "AI Voice Assistant",
        "description": "Intelligent voice-controlled device for home automation",
        "category": "electronics",
        "price": 149.99
      },
      "similarity": 0.823
    }
  ],
  "executionTime": 156
}
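
A response like the one above usually needs a little shaping before it is shown to users. The following sketch turns it into a flat, ranked list. `toRecommendations` is a hypothetical helper, the `name` and `title` fields are assumed from the examples in this document, and the explicit sort is defensive because the response order is not stated here.

```javascript
// Hypothetical helper: flattens a find-similar response into display rows.
function toRecommendations(response, minScore = 0) {
  if (!response || response.success !== true) return [];
  return (response.similar_records ?? [])
    .filter((item) => item.similarity >= minScore)
    // Sort defensively; this document does not say the API returns results ordered
    .sort((a, b) => b.similarity - a.similarity)
    .map((item) => ({
      table: item.table,
      id: item.record.id,
      // Assumption: records carry `name` (products) or `title` (articles/documents)
      label: item.record.name ?? item.record.title ?? String(item.record.id),
      score: Math.round(item.similarity * 100), // whole-number percentage for display
    }));
}

// Trimmed copy of the example response above
const exampleResponse = {
  success: true,
  source_record: { id: '123', name: 'AI-Powered Smart Assistant' },
  similar_records: [
    { table: 'products', record: { id: '789', name: 'AI Voice Assistant' }, similarity: 0.823 },
    { table: 'products', record: { id: '456', name: 'Smart Home Hub' }, similarity: 0.891 },
  ],
  executionTime: 156,
};

console.log(toRecommendations(exampleResponse));
```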

Common Errors and How to Fix Them

Error | Cause | Solution
“Record not found” | The specified record ID doesn’t exist | Check that the record ID exists in the specified table
“Table not found” | The specified table doesn’t exist | Check your table name for typos
“No embeddings found” | The table doesn’t have vector embeddings | Ensure your table has been processed for embeddings
“Threshold too high” | No records meet the similarity threshold | Lower the threshold value (try 0.5-0.6)
“Unauthorized” | Invalid or missing API key | Check that you’re using a valid API key
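
One way to act on the “Threshold too high” advice is to retry with a progressively lower threshold. The sketch below is a hypothetical wrapper, not an SDK feature. It assumes that a too-strict threshold shows up either as an empty similar_records array or as an error whose message contains the text from the table; how the SDK actually reports this case is not confirmed here. The `floor` and `step` options are illustrative names.

```javascript
// Hypothetical wrapper. `findSimilar` is any async function with the same call
// shape as client.db.findSimilar, passed in so the logic can be tried with a stub.
async function findSimilarWithFallback(findSimilar, params, options = {}) {
  const { floor = 0.5, step = 0.1 } = options;
  if (!(step > 0)) throw new RangeError('step must be greater than 0');

  let threshold = params.threshold ?? 0.7; // documented default
  for (;;) {
    let response = null;
    let records = [];
    try {
      response = await findSimilar({ ...params, threshold });
      records = response?.similar_records ?? [];
    } catch (error) {
      // Assumption: the error text from the table appears in error.message.
      // Anything else (auth, missing table, ...) is rethrown untouched.
      if (!/threshold too high/i.test(String(error?.message))) throw error;
    }

    // Stop on the first non-empty result, or once the floor has been tried
    if (records.length > 0 || threshold <= floor) {
      return { threshold, response };
    }

    // Lower the threshold, rounding to two decimals to avoid floating-point drift
    threshold = Math.max(floor, Math.round((threshold - step) * 100) / 100);
  }
}
```

With the SDK client from the examples above, a call might look like `findSimilarWithFallback((p) => client.db.findSimilar(p), { table: "products", record_id: "123" })`. The returned `threshold` tells you which value finally produced results, which can guide the default you choose later.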

Tips for Better Results

  • Start with moderate thresholds: Begin with 0.6-0.7 and adjust based on your needs
  • Use cross-table search: Find similar content across different tables for broader discovery
  • Include source for comparison: Set exclude_self: false to see how the source ranks
  • Experiment with limits: Try different limit values (1-100) to balance how many matches you surface against response size and relevance of the lowest-ranked results
  • Monitor similarity scores: Higher scores indicate better matches
  • Use for content discovery: Great for helping users find related content they might like

Use Cases

E-commerce “More Like This”

// Find products similar to what a user is viewing
const similarProducts = await client.db.findSimilar({
  table: "products",
  record_id: currentProductId,
  limit: 8,
  threshold: 0.7
});

Content Recommendation

// Find articles similar to one a user is reading
const relatedArticles = await client.db.findSimilar({
  table: "articles",
  record_id: currentArticleId,
  limit: 5,
  threshold: 0.6
});

Cross-Table Discovery

// Find products similar to a blog post about technology
const relatedProducts = await client.db.findSimilar({
  table: "blog_posts",
  record_id: techPostId,
  target_table: "products",
  limit: 10,
  threshold: 0.5
});

Duplicate Detection

// Find potentially duplicate records
const potentialDuplicates = await client.db.findSimilar({
  table: "contacts",
  record_id: contactId,
  limit: 20,
  threshold: 0.9,
  exclude_self: true
});