Identifies anomalous (outlier) records by running K-nearest neighbors (KNN) analysis on vector embeddings. Typical uses are fraud detection, data quality checks, and spotting unusual patterns in your data.
POST https://api.worqhat.com/db/detect-anomalies
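
If you are not using the SDK, the call can be sketched with the `fetch` built into Node.js 18+. Only the URL and the POST method come from this page. The `Authorization: Bearer` header, the JSON body keyed like the SDK parameters, and the function names are assumptions for illustration, so confirm them against your account's API reference.

```javascript
// Endpoint as documented on this page.
const DETECT_ANOMALIES_URL = "https://api.worqhat.com/db/detect-anomalies";

// Build the fetch arguments separately so they can be inspected or logged.
// ASSUMPTION: the API accepts a Bearer token and a JSON body whose keys
// match the SDK parameters (table, k, threshold, limit).
function buildDetectAnomaliesRequest(apiKey, params) {
  return {
    url: DETECT_ANOMALIES_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(params),
    },
  };
}

// Send the request and parse the JSON response body.
async function detectAnomaliesRaw(apiKey, params) {
  const { url, init } = buildDetectAnomaliesRequest(apiKey, params);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`detect-anomalies failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Keeping the request builder separate from the network call makes it easy to log or unit-test the payload without sending anything.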

What Does This Endpoint Do?

This endpoint identifies records that differ significantly from the majority of your data. It compares each record's vector embedding with those of its nearest neighbors to find outliers that might indicate fraud, errors, or patterns that warrant investigation.

When to Use Anomaly Detection

This endpoint is useful for:
  • Fraud detection: Identify suspicious transactions, accounts, or activities
  • Data quality assurance: Find records that might contain errors or inconsistencies
  • Security monitoring: Detect unusual patterns that could indicate security breaches
  • Business intelligence: Discover outliers that might represent new opportunities or risks
  • Compliance monitoring: Identify records that don’t conform to expected patterns
  • Research and analysis: Find interesting edge cases or unusual data points

How It Works

  1. You specify a table to analyze for anomalies
  2. The system retrieves vector embeddings for all records in the table
  3. It performs K-nearest neighbors analysis to calculate anomaly scores
  4. Records with high anomaly scores are identified as outliers
  5. Results include nearest neighbors and distance metrics for each anomaly
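
To make steps 2 to 4 concrete, here is a minimal local sketch of KNN-based anomaly scoring. It is not the service's implementation: the Euclidean distance metric, the brute-force neighbor search, and the normalization of scores into 0-1 (average distance divided by the largest average distance) are all assumptions chosen for illustration, and the function names are hypothetical.

```javascript
// Euclidean distance between two equal-length vectors.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Brute-force KNN anomaly scoring over an array of embedding vectors.
// ASSUMPTION: the score is the average distance to the k nearest neighbors,
// divided by the largest such average so that scores fall in [0, 1].
function knnAnomalyScores(embeddings, k) {
  if (embeddings.length < 2) {
    throw new RangeError("need at least two embeddings to compare");
  }
  const avgDistances = embeddings.map((vec, i) => {
    const nearest = embeddings
      .filter((_, j) => j !== i) // skip the record itself
      .map((other) => euclidean(vec, other))
      .sort((x, y) => x - y)
      .slice(0, k); // keep the k smallest distances
    return nearest.reduce((s, d) => s + d, 0) / nearest.length;
  });
  const maxAvg = Math.max(...avgDistances);
  return avgDistances.map((avg, index) => ({
    index,
    avg_distance: avg,
    anomaly_score: maxAvg === 0 ? 0 : avg / maxAvg,
  }));
}

// Four clustered points and one far-away point.
const points = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10]];
const outliers = knnAnomalyScores(points, 2).filter((r) => r.anomaly_score >= 0.8);
console.log(outliers.map((r) => r.index)); // [ 4 ]
```

The clustered points each sit at distance 1 from their two nearest neighbors, while the point at (10, 10) is roughly 13 units from its nearest neighbors, so only that point crosses the 0.8 threshold.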

Code Examples

Example 1: Basic Anomaly Detection

This example shows how to detect anomalies in a transactions table.
  • Node.js
import Worqhat from 'worqhat';

// Initialize the client with your API key
const client = new Worqhat({
  apiKey: process.env.WORQHAT_API_KEY, // Always use environment variables for API keys
});

async function detectAnomalies() {
  try {
    // Call the detectAnomalies method
    const response = await client.db.detectAnomalies({
      table: "transactions",
      k: 10,
      threshold: 0.8,
      limit: 50,
      environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
    });
    
    // Handle the successful response
    console.log(`Analyzed ${response.total_records} records`);
    console.log(`Found ${response.anomaly_count} anomalies`);
    console.log('Anomalies:', response.anomalies);
    return response;
  } catch (error) {
    // Handle any errors
    console.error('Error detecting anomalies:', error.message);
  }
}

// Call the function
detectAnomalies();

Example 2: Fraud Detection with Custom Parameters

This example shows how to tune parameters for fraud detection in financial transactions.
  • Node.js
import Worqhat from 'worqhat';

// Initialize the client with your API key
const client = new Worqhat({
  apiKey: process.env.WORQHAT_API_KEY,
});

async function detectFraud() {
  try {
    // Detect fraud with stricter parameters
    const response = await client.db.detectAnomalies({
      table: "financial_transactions",
      k: 15,              // More neighbors for more stable scores
      threshold: 0.9,     // Higher threshold for stricter detection
      limit: 100,         // More results to investigate
      environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
    });
    
    // Handle the successful response
    console.log(`Fraud detection analysis complete`);
    console.log(`Total transactions analyzed: ${response.total_records}`);
    console.log(`Potential fraud cases: ${response.anomaly_count}`);
    
    // Process each anomaly
    response.anomalies.forEach((anomaly, index) => {
      console.log(`\nAnomaly ${index + 1}:`);
      console.log(`- Transaction ID: ${anomaly.record.transaction_id}`);
      console.log(`- Amount: $${anomaly.record.amount}`);
      console.log(`- Anomaly Score: ${anomaly.anomaly_score}`);
      console.log(`- Average Distance: ${anomaly.avg_distance}`);
    });
    return response;
  } catch (error) {
    console.error('Error detecting fraud:', error.message);
  }
}

detectFraud();

Example 3: Data Quality Check

This example shows how to use anomaly detection for data quality assurance.
  • Node.js
import Worqhat from 'worqhat';

// Initialize the client with your API key
const client = new Worqhat({
  apiKey: process.env.WORQHAT_API_KEY,
});

async function checkDataQuality() {
  try {
    // Check data quality with moderate parameters
    const response = await client.db.detectAnomalies({
      table: "user_profiles",
      k: 8,               // Moderate number of neighbors
      threshold: 0.7,      // Moderate threshold for data quality
      limit: 30,           // Reasonable number of results
      environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
    });
    
    // Handle the successful response
    console.log(`Data quality check complete`);
    console.log(`Total profiles analyzed: ${response.total_records}`);
    console.log(`Potential data quality issues: ${response.anomaly_count}`);
    
    // Analyze anomalies for data quality issues
    response.anomalies.forEach((anomaly, index) => {
      console.log(`\nData Quality Issue ${index + 1}:`);
      console.log(`- User ID: ${anomaly.record.user_id}`);
      console.log(`- Profile: ${JSON.stringify(anomaly.record, null, 2)}`);
      console.log(`- Anomaly Score: ${anomaly.anomaly_score}`);
      console.log(`- Nearest Neighbors: ${anomaly.nearest_neighbors.length}`);
    });
    return response;
  } catch (error) {
    console.error('Error checking data quality:', error.message);
  }
}

checkDataQuality();

Request Body Explained

  • table (string, required): Table to analyze for anomalies. Example: “transactions”
  • k (number): Number of nearest neighbors to consider for anomaly detection. Range: 1-50, default: 10. Higher values provide more stable results but may miss subtle anomalies.
  • threshold (number): Minimum anomaly score threshold (0-1). Only records above this threshold are considered anomalies. Default: 0.8. Higher values detect fewer, more extreme anomalies.
  • limit (number): Maximum number of anomalies to return. Range: 1-100, default: 50.
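
The documented ranges and defaults can be checked on the client before a request is sent. The sketch below is a hypothetical helper, not part of the SDK. It assumes `k` and `limit` must be integers and that the `threshold` range 0-1 is inclusive, neither of which this page states explicitly.

```javascript
// Defaults as documented in the request body section.
const DEFAULTS = { k: 10, threshold: 0.8, limit: 50 };

// Fill in defaults and reject values outside the documented ranges.
// ASSUMPTION: k and limit are integers; threshold bounds are inclusive.
function normalizeParams({ table, k, threshold, limit } = {}) {
  if (typeof table !== "string" || table.length === 0) {
    throw new TypeError("table is required and must be a non-empty string");
  }
  const params = {
    table,
    k: k ?? DEFAULTS.k,
    threshold: threshold ?? DEFAULTS.threshold,
    limit: limit ?? DEFAULTS.limit,
  };
  if (!Number.isInteger(params.k) || params.k < 1 || params.k > 50) {
    throw new RangeError("k must be an integer from 1 to 50");
  }
  if (
    typeof params.threshold !== "number" ||
    Number.isNaN(params.threshold) ||
    params.threshold < 0 ||
    params.threshold > 1
  ) {
    throw new RangeError("threshold must be a number from 0 to 1");
  }
  if (!Number.isInteger(params.limit) || params.limit < 1 || params.limit > 100) {
    throw new RangeError("limit must be an integer from 1 to 100");
  }
  return params;
}

console.log(normalizeParams({ table: "transactions", k: 15 }));
// { table: 'transactions', k: 15, threshold: 0.8, limit: 50 }
```

The returned object can be passed directly to `client.db.detectAnomalies`, so invalid values fail fast locally instead of costing a round trip.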

Response Fields Explained

  • success (boolean): true if anomaly detection was successful, false otherwise.
  • anomalies (array): Array of detected anomalies, each containing:
    • record: The anomalous record data
    • anomaly_score: Anomaly score (0-1, higher = more unusual)
    • avg_distance: Average distance to K nearest neighbors
    • nearest_neighbors: Array of nearest neighbor records with distances
  • total_records (number): Total number of records analyzed.
  • anomaly_count (number): Number of anomalies detected.
  • parameters (object): Parameters used for the analysis (k, threshold).
  • executionTime (number): Analysis execution time in milliseconds.

Example Response

{
  "success": true,
  "anomalies": [
    {
      "record": {
        "id": "txn_123",
        "amount": 50000.00,
        "currency": "USD",
        "merchant": "Unknown",
        "location": "Remote"
      },
      "anomaly_score": 0.923,
      "avg_distance": 0.456,
      "nearest_neighbors": [
        {
          "record": {
            "id": "txn_456",
            "amount": 25.50,
            "currency": "USD",
            "merchant": "Coffee Shop",
            "location": "Local"
          },
          "distance": 0.234
        },
        {
          "record": {
            "id": "txn_789",
            "amount": 15.75,
            "currency": "USD",
            "merchant": "Restaurant",
            "location": "Local"
          },
          "distance": 0.312
        }
      ]
    }
  ],
  "total_records": 1000,
  "anomaly_count": 15,
  "parameters": {
    "k": 10,
    "threshold": 0.8
  },
  "executionTime": 1247
}
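
A response of this shape can be condensed into a review list. The following sketch sorts anomalies by score and reports each one's closest neighbor. The helper name is hypothetical, and the `id` field is an assumption taken from the example record above; your table's key column may differ.

```javascript
// Rank anomalies by score (highest first) and attach the closest neighbor.
// ASSUMPTION: each record carries an `id` field, as in the example response.
function summarizeAnomalies(response) {
  return [...response.anomalies] // copy so the original order is untouched
    .sort((a, b) => b.anomaly_score - a.anomaly_score)
    .map((anomaly) => {
      const closest = anomaly.nearest_neighbors.reduce(
        (best, n) => (best === null || n.distance < best.distance ? n : best),
        null
      );
      return {
        id: anomaly.record.id,
        anomaly_score: anomaly.anomaly_score,
        avg_distance: anomaly.avg_distance,
        closest_neighbor_id: closest ? closest.record.id : null,
        closest_distance: closest ? closest.distance : null,
      };
    });
}

// Trimmed copy of the example response above.
const sample = {
  anomalies: [
    {
      record: { id: "txn_123", amount: 50000.0 },
      anomaly_score: 0.923,
      avg_distance: 0.456,
      nearest_neighbors: [
        { record: { id: "txn_456", amount: 25.5 }, distance: 0.234 },
        { record: { id: "txn_789", amount: 15.75 }, distance: 0.312 },
      ],
    },
  ],
};

console.table(summarizeAnomalies(sample));
```

Comparing an anomaly with its closest neighbor (here `txn_456` at distance 0.234) is a quick way to see what a "normal" record looks like next to the flagged one.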

Common Errors and How to Fix Them

| Error | Cause | Solution |
| --- | --- | --- |
| “Table not found” | The specified table doesn’t exist | Check your table name for typos |
| “No embeddings found” | The table doesn’t have vector embeddings | Ensure your table has been processed for embeddings |
| “Insufficient data” | Not enough records for meaningful analysis | Ensure the table has at least 20 records |
| “K value too large” | K parameter exceeds available data | Reduce the K value or add more records |
| “Threshold too high” | No records meet the anomaly threshold | Lower the threshold value (try 0.6-0.7) |
| “Unauthorized” | Invalid or missing API key | Check that you’re using a valid API key |
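
The errors listed above can be turned into actionable hints inside a `catch` block. This sketch assumes the thrown error's `message` contains one of the documented strings; the actual error shape returned by the SDK is not specified on this page, and `suggestFix` is a hypothetical helper.

```javascript
// Error messages and fixes taken from the list above.
const ERROR_FIXES = [
  ["Table not found", "Check your table name for typos"],
  ["No embeddings found", "Ensure your table has been processed for embeddings"],
  ["Insufficient data", "Ensure the table has at least 20 records"],
  ["K value too large", "Reduce the K value or add more records"],
  ["Threshold too high", "Lower the threshold value (try 0.6-0.7)"],
  ["Unauthorized", "Check that you're using a valid API key"],
];

// Return a suggested fix for a known error, or null if none matches.
// ASSUMPTION: the documented text appears somewhere in error.message.
function suggestFix(error) {
  const message = String(error && error.message ? error.message : error);
  const match = ERROR_FIXES.find(([text]) =>
    message.toLowerCase().includes(text.toLowerCase())
  );
  return match ? match[1] : null;
}

// Example: wrap any call that may throw one of the documented errors.
try {
  throw new Error("Table not found: transactionz");
} catch (error) {
  console.error(error.message);
  console.error("Suggested fix:", suggestFix(error) ?? "none documented");
}
```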

Tips for Better Anomaly Detection

  • Start with moderate parameters: Begin with k=10, threshold=0.8 and adjust based on results
  • Consider your use case:
    • Fraud detection: Higher threshold (0.9+), more neighbors (k=15+)
    • Data quality: Moderate threshold (0.7), moderate neighbors (k=8-10)
    • General analysis: Lower threshold (0.6), fewer neighbors (k=5-8)
  • Review nearest neighbors: Examine what “normal” records look like to understand anomalies
  • Monitor execution time: Larger datasets take longer to analyze
  • Validate results: Manually review detected anomalies to ensure they make sense
  • Iterate and refine: Adjust parameters based on the quality of detected anomalies

Parameter Tuning Guidelines

K (Number of Neighbors)

  • Low K (5-8): More sensitive to local variations, may detect subtle anomalies
  • Medium K (10-15): Balanced approach, good for most use cases
  • High K (20+): More stable, focuses on major outliers, less sensitive to noise

Threshold (Anomaly Score)

  • Low Threshold (0.6-0.7): Detects more anomalies, including minor outliers
  • Medium Threshold (0.8): Balanced detection, good default
  • High Threshold (0.9+): Only detects major anomalies, very strict
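
One way to apply these guidelines is to run the same table at several thresholds and compare how many anomalies each one yields. The helper below is a hypothetical sketch. It takes the detection function as an argument, so it works with `client.db.detectAnomalies` or a stub, and it assumes the response exposes `anomaly_count` as documented above.

```javascript
// Run detection at several thresholds with a fixed k and collect the counts.
// `detect` is any async function that accepts the documented parameters and
// resolves to a response containing `anomaly_count`.
async function sweepThresholds(detect, table, thresholds, k = 10) {
  const rows = [];
  // Requests run one at a time to avoid bursts against the API.
  for (const threshold of thresholds) {
    const response = await detect({ table, k, threshold, limit: 100 });
    rows.push({ threshold, anomaly_count: response.anomaly_count });
  }
  return rows;
}

// Usage with the SDK client from the examples above (inside an async function):
//   const rows = await sweepThresholds(
//     (params) => client.db.detectAnomalies(params),
//     "transactions",
//     [0.6, 0.7, 0.8, 0.9]
//   );
//   console.table(rows);
```

A sharp drop in `anomaly_count` between two adjacent thresholds is a reasonable place to start manual review before settling on a value.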

Use Cases

Fraud Detection

// Detect suspicious financial transactions
const fraudAnomalies = await client.db.detectAnomalies({
  table: "transactions",
  k: 15,
  threshold: 0.9,
  limit: 100
});

Data Quality Assurance

// Find records with potential data quality issues
const qualityIssues = await client.db.detectAnomalies({
  table: "user_profiles",
  k: 8,
  threshold: 0.7,
  limit: 50
});

Security Monitoring

// Detect unusual user behavior patterns
const securityAnomalies = await client.db.detectAnomalies({
  table: "user_activities",
  k: 12,
  threshold: 0.85,
  limit: 30
});

Business Intelligence

// Find interesting outliers in business data
const businessAnomalies = await client.db.detectAnomalies({
  table: "sales_data",
  k: 10,
  threshold: 0.75,
  limit: 20
});