Database

Identifies anomalous or outlier records using K-nearest neighbors analysis on vector embeddings. It is suited to fraud detection, data quality checks, and spotting unusual patterns in your data.

POST https://api.worqhat.com/db/detect-anomalies
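
For callers who are not using the SDK, the request can be sketched as a plain HTTP call. This is a minimal sketch, not the documented integration path: `buildDetectAnomaliesRequest` and `detectAnomaliesRaw` are hypothetical helper names, and the `Authorization: Bearer` header and JSON body are assumptions, since this page does not state the authentication scheme.

```javascript
// Hypothetical helpers for calling the endpoint without the SDK.
// ASSUMPTION: the API accepts a JSON body and a Bearer token in the
// Authorization header; this page does not document the auth scheme.
const ENDPOINT = "https://api.worqhat.com/db/detect-anomalies";

// Builds the URL and fetch options without sending anything.
function buildDetectAnomaliesRequest(apiKey, body) {
  return {
    url: ENDPOINT,
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      },
      body: JSON.stringify(body),
    },
  };
}

// Sends the request with the global fetch (Node 18+ and browsers).
async function detectAnomaliesRaw(apiKey, body) {
  const { url, options } = buildDetectAnomaliesRequest(apiKey, body);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Keeping the request construction separate from the network call makes it possible to inspect or log the payload before anything is sent.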

What Does This Endpoint Do?

This endpoint identifies records that differ significantly from the majority of your data. It compares each record's vector embedding with those of its nearest neighbors to find outliers that might indicate fraud, errors, or patterns that warrant investigation.

When to Use Anomaly Detection

This endpoint is useful for:

  • Fraud detection: Identify suspicious transactions, accounts, or activities
  • Data quality assurance: Find records that might contain errors or inconsistencies
  • Security monitoring: Detect unusual patterns that could indicate security breaches
  • Business intelligence: Discover outliers that might represent new opportunities or risks
  • Compliance monitoring: Identify records that don't conform to expected patterns
  • Research and analysis: Find interesting edge cases or unusual data points

How It Works

  1. You specify a table to analyze for anomalies
  2. The system retrieves vector embeddings for all records in the table
  3. It performs K-nearest neighbors analysis to calculate anomaly scores
  4. Records with high anomaly scores are identified as outliers
  5. Results include nearest neighbors and distance metrics for each anomaly
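
The steps above can be sketched locally to build intuition for what a K-nearest neighbors anomaly score measures. This is an illustration under stated assumptions, not the service's implementation: the Euclidean distance metric and the min-max normalization into a 0-1 score are both assumptions, and the function names are hypothetical.

```javascript
// Euclidean distance between two equal-length vectors.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// For each embedding, average the distance to its k nearest neighbors,
// then min-max normalize those averages into a 0-1 score.
// ASSUMPTION: this normalization is illustrative; the service's real
// scoring formula is not documented on this page.
// Expects at least two embeddings so every record has a neighbor.
function knnAnomalyScores(embeddings, k) {
  const avgDistances = embeddings.map((vec, i) => {
    const nearest = embeddings
      .filter((_, j) => j !== i)
      .map((other) => euclidean(vec, other))
      .sort((x, y) => x - y)
      .slice(0, k);
    return nearest.reduce((s, d) => s + d, 0) / nearest.length;
  });
  const min = Math.min(...avgDistances);
  const max = Math.max(...avgDistances);
  const range = max - min || 1; // avoid dividing by zero when all are equal
  return avgDistances.map((avg) => ({
    avg_distance: avg,
    anomaly_score: (avg - min) / range,
  }));
}

// Keep records whose score exceeds the threshold, highest score first.
function detectOutliers(embeddings, { k = 10, threshold = 0.8, limit = 50 } = {}) {
  return knnAnomalyScores(embeddings, k)
    .map((s, index) => ({ index, ...s }))
    .filter((s) => s.anomaly_score > threshold)
    .sort((a, b) => b.anomaly_score - a.anomaly_score)
    .slice(0, limit);
}
```

For example, `detectOutliers([[0, 0], [0, 1], [1, 0], [1, 1], [10, 10]], { k: 2 })` flags only the last point, because its two nearest neighbors are far away while every other point has neighbors at distance 1.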

Code Examples

Example 1: Basic Anomaly Detection

This example shows how to detect anomalies in a transactions table.

JavaScript

detect-anomalies.js

    import Worqhat from 'worqhat';

    // Initialize the client with your API key
    const client = new Worqhat({
      apiKey: process.env.WORQHAT_API_KEY, // Always use environment variables for API keys
    });

    async function detectAnomalies() {
      try {
        // Call the detectAnomalies method
        const response = await client.db.detectAnomalies({
          table: "transactions",
          k: 10,
          threshold: 0.8,
          limit: 50,
          environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
        });

        // Handle the successful response
        console.log(`Analyzed ${response.total_records} records`);
        console.log(`Found ${response.anomaly_count} anomalies`);
        console.log('Anomalies:', response.anomalies);
        return response;
      } catch (error) {
        // Handle any errors
        console.error('Error detecting anomalies:', error.message);
      }
    }

    // Call the function
    detectAnomalies();

Example 2: Fraud Detection with Custom Parameters

This example shows how to tune parameters for fraud detection in financial transactions.

JavaScript

detect-fraud.js

    import Worqhat from 'worqhat';

    // Initialize the client with your API key
    const client = new Worqhat({
      apiKey: process.env.WORQHAT_API_KEY,
    });

    async function detectFraud() {
      try {
        // Detect fraud with stricter parameters
        const response = await client.db.detectAnomalies({
          table: "financial_transactions",
          k: 15,              // More neighbors for better accuracy
          threshold: 0.9,     // Higher threshold for stricter detection
          limit: 100,         // More results to investigate
          environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
        });

        // Handle the successful response
        console.log(`Fraud detection analysis complete`);
        console.log(`Total transactions analyzed: ${response.total_records}`);
        console.log(`Potential fraud cases: ${response.anomaly_count}`);

        // Process each anomaly
        response.anomalies.forEach((anomaly, index) => {
          console.log(`\nAnomaly ${index + 1}:`);
          console.log(`- Transaction ID: ${anomaly.record.transaction_id}`);
          console.log(`- Amount: $${anomaly.record.amount}`);
          console.log(`- Anomaly Score: ${anomaly.anomaly_score}`);
          console.log(`- Average Distance: ${anomaly.avg_distance}`);
        });
        return response;
      } catch (error) {
        console.error('Error detecting fraud:', error.message);
      }
    }

    detectFraud();

Example 3: Data Quality Check

This example shows how to use anomaly detection for data quality assurance.

JavaScript

data-quality.js

    import Worqhat from 'worqhat';

    // Initialize the client with your API key
    const client = new Worqhat({
      apiKey: process.env.WORQHAT_API_KEY,
    });

    async function checkDataQuality() {
      try {
        // Check data quality with moderate parameters
        const response = await client.db.detectAnomalies({
          table: "user_profiles",
          k: 8,               // Moderate number of neighbors
          threshold: 0.7,     // Moderate threshold for data quality
          limit: 30,          // Reasonable number of results
          environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
        });

        // Handle the successful response
        console.log(`Data quality check complete`);
        console.log(`Total profiles analyzed: ${response.total_records}`);
        console.log(`Potential data quality issues: ${response.anomaly_count}`);

        // Analyze anomalies for data quality issues
        response.anomalies.forEach((anomaly, index) => {
          console.log(`\nData Quality Issue ${index + 1}:`);
          console.log(`- User ID: ${anomaly.record.user_id}`);
          console.log(`- Profile: ${JSON.stringify(anomaly.record, null, 2)}`);
          console.log(`- Anomaly Score: ${anomaly.anomaly_score}`);
          console.log(`- Nearest Neighbors: ${anomaly.nearest_neighbors.length}`);
        });
        return response;
      } catch (error) {
        console.error('Error checking data quality:', error.message);
      }
    }

    checkDataQuality();

Request Body Explained

table (string, body, required)

Table to analyze for anomalies.

Example: "transactions"

k (number, body, optional)

Number of nearest neighbors to consider for anomaly detection. Range: 1-50, default: 10. Higher values provide more stable results but may miss subtle anomalies.

threshold (number, body, optional)

Minimum anomaly score threshold (0-1). Only records above this threshold are considered anomalies. Default: 0.8. Higher values detect fewer, more extreme anomalies.

limit (number, body, optional)

Maximum number of anomalies to return. Range: 1-100, default: 50.
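
The documented defaults and ranges can be checked on the client before a request is sent. The helper below is a hypothetical sketch (`normalizeDetectParams` is not part of the SDK); it also assumes that `k` and `limit` must be whole numbers, which the page implies but does not state.

```javascript
// Documented defaults for the optional parameters.
const DEFAULTS = { k: 10, threshold: 0.8, limit: 50 };

// Hypothetical helper: fills in defaults and rejects out-of-range values.
// ASSUMPTION: k and limit are integers; the page only says "number".
function normalizeDetectParams(params) {
  if (typeof params.table !== "string" || params.table.length === 0) {
    throw new TypeError("table is required and must be a non-empty string");
  }
  const k = params.k ?? DEFAULTS.k;
  const threshold = params.threshold ?? DEFAULTS.threshold;
  const limit = params.limit ?? DEFAULTS.limit;

  if (!Number.isInteger(k) || k < 1 || k > 50) {
    throw new RangeError("k must be an integer between 1 and 50");
  }
  if (typeof threshold !== "number" || threshold < 0 || threshold > 1) {
    throw new RangeError("threshold must be a number between 0 and 1");
  }
  if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
    throw new RangeError("limit must be an integer between 1 and 100");
  }
  // Preserve any other fields the caller passed through.
  return { ...params, k, threshold, limit };
}
```

Calling `normalizeDetectParams({ table: "transactions" })` returns the table name together with `k: 10`, `threshold: 0.8`, and `limit: 50`.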

Response Fields Explained

success (boolean, required)

true if anomaly detection was successful, false otherwise.

anomalies (array, required)

Array of detected anomalies, each containing:

  • record: The anomalous record data
  • anomaly_score: Anomaly score (0-1, higher = more unusual)
  • avg_distance: Average distance to K nearest neighbors
  • nearest_neighbors: Array of nearest neighbor records with distances

total_records (number, required)

Total number of records analyzed.

anomaly_count (number, required)

Number of anomalies detected.

parameters (object, required)

Parameters used for the analysis (k, threshold).

executionTime (number, required)

Analysis execution time in milliseconds.

Example Response

response.json

    {
      "success": true,
      "anomalies": [
        {
          "record": {
            "id": "txn_123",
            "amount": 50000.00,
            "currency": "USD",
            "merchant": "Unknown",
            "location": "Remote"
          },
          "anomaly_score": 0.923,
          "avg_distance": 0.456,
          "nearest_neighbors": [
            {
              "record": {
                "id": "txn_456",
                "amount": 25.50,
                "currency": "USD",
                "merchant": "Coffee Shop",
                "location": "Local"
              },
              "distance": 0.234
            },
            {
              "record": {
                "id": "txn_789",
                "amount": 15.75,
                "currency": "USD",
                "merchant": "Restaurant",
                "location": "Local"
              },
              "distance": 0.312
            }
          ]
        }
      ],
      "total_records": 1000,
      "anomaly_count": 15,
      "parameters": {
        "k": 10,
        "threshold": 0.8
      },
      "executionTime": 1247
    }
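
A response shaped like the example above can be condensed into one row per anomaly for review. The sketch below is hypothetical (`summarizeAnomalies` is not an SDK function) and assumes each record carries an `id` field, as in the example; real records follow your table's schema.

```javascript
// Hypothetical helper: one compact row per anomaly, highest score first.
// ASSUMPTION: records expose an `id` field, as in the example response.
function summarizeAnomalies(response) {
  if (!response.success) {
    throw new Error("Anomaly detection did not succeed");
  }
  return [...response.anomalies] // copy so the caller's array is not reordered
    .sort((a, b) => b.anomaly_score - a.anomaly_score)
    .map((anomaly) => {
      const distances = anomaly.nearest_neighbors.map((n) => n.distance);
      return {
        id: anomaly.record.id,
        score: anomaly.anomaly_score,
        avgDistance: anomaly.avg_distance,
        // Distance to the single closest neighbor, or null if none returned.
        closestNeighbor: distances.length ? Math.min(...distances) : null,
      };
    });
}
```

For the example response, this yields a single row with `id: "txn_123"`, `score: 0.923`, and `closestNeighbor: 0.234`.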

Common Errors and How to Fix Them

| Error | Cause | Solution |
| --- | --- | --- |
| "Table not found" | The specified table doesn't exist | Check your table name for typos |
| "No embeddings found" | The table doesn't have vector embeddings | Ensure your table has been processed for embeddings |
| "Insufficient data" | Not enough records for meaningful analysis | Ensure you have at least 20 records in the table |
| "K value too large" | K parameter exceeds available data | Reduce the K value or add more records |
| "Threshold too high" | No records meet the anomaly threshold | Lower the threshold value (try 0.6-0.7) |
| "Unauthorized" | Invalid or missing API key | Check that you're using a valid API key |

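The errors listed above can be mapped to their suggested fixes inside a `catch` block. This is a hedged sketch: `suggestFix` is a hypothetical helper, and it assumes the error's `message` contains the quoted text from the list, which this page does not guarantee.

```javascript
// Documented error messages paired with their suggested fixes.
const FIXES = [
  ["Table not found", "Check your table name for typos."],
  ["No embeddings found", "Ensure your table has been processed for embeddings."],
  ["Insufficient data", "Ensure you have at least 20 records in the table."],
  ["K value too large", "Reduce the K value or add more records."],
  ["Threshold too high", "Lower the threshold value (try 0.6-0.7)."],
  ["Unauthorized", "Check that you're using a valid API key."],
];

// Hypothetical helper: returns the documented fix for a caught error.
// ASSUMPTION: error.message contains one of the documented strings.
function suggestFix(error) {
  const message = String(error && error.message ? error.message : error).toLowerCase();
  const match = FIXES.find(([needle]) => message.includes(needle.toLowerCase()));
  return match ? match[1] : "No documented fix; inspect the full error.";
}
```

A typical use is `console.error(error.message, "->", suggestFix(error))` inside the `catch` block of a `detectAnomalies` call.
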
Tips for Better Anomaly Detection

  • Start with moderate parameters: Begin with k=10, threshold=0.8 and adjust based on results
  • Consider your use case:
    • Fraud detection: Higher threshold (0.9+), more neighbors (k=15+)
    • Data quality: Moderate threshold (0.7), moderate neighbors (k=8-10)
    • General analysis: Lower threshold (0.6), fewer neighbors (k=5-8)
  • Review nearest neighbors: Examine what "normal" records look like to understand anomalies
  • Monitor execution time: Larger datasets take longer to analyze
  • Validate results: Manually review detected anomalies to ensure they make sense
  • Iterate and refine: Adjust parameters based on the quality of detected anomalies
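
The "iterate and refine" tip can be sketched as a loop that starts strict and relaxes the threshold until enough anomalies are returned. `refineThreshold` is a hypothetical helper; the `detect` argument is any async function with the same input and output shape as `client.db.detectAnomalies`, injected so the loop can be tried without network access. The starting point, floor, and step size are illustrative choices.

```javascript
// Lowers the threshold step by step until enough anomalies come back.
// `detect` stands in for client.db.detectAnomalies (same input/output shape).
async function refineThreshold(
  detect,
  baseParams,
  { start = 0.9, floor = 0.6, step = 0.1, minAnomalies = 1 } = {}
) {
  // Work in integer hundredths to avoid floating-point drift.
  const stepHundredths = Math.max(1, Math.round(step * 100));
  const floorHundredths = Math.round(floor * 100);
  for (let t = Math.round(start * 100); t >= floorHundredths; t -= stepHundredths) {
    const threshold = t / 100;
    const response = await detect({ ...baseParams, threshold });
    if (response.anomaly_count >= minAnomalies) {
      return { threshold, response };
    }
  }
  return null; // nothing found even at the lowest threshold tried
}
```

With the SDK this might be called as `refineThreshold((p) => client.db.detectAnomalies(p), { table: "transactions", k: 10 })`. Each step is a separate API call, so a coarse step keeps the number of requests small.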

Parameter Tuning Guidelines

K (Number of Neighbors)

  • Low K (5-8): More sensitive to local variations, may detect subtle anomalies
  • Medium K (10-15): Balanced approach, good for most use cases
  • High K (20+): More stable, focuses on major outliers, less sensitive to noise

Threshold (Anomaly Score)

  • Low Threshold (0.6-0.7): Detects more anomalies, including minor outliers
  • Medium Threshold (0.8): Balanced detection, good default
  • High Threshold (0.9+): Only detects major anomalies, very strict
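
The effect of the threshold bands above can be previewed from a single response: request once with a low threshold, then count locally how many of the returned anomalies would survive stricter values. `countByThreshold` is a hypothetical helper. Because `limit` caps the response at 100 anomalies, the counts are lower bounds whenever that cap was reached.

```javascript
// Counts how many returned anomalies score above each candidate threshold.
function countByThreshold(anomalies, thresholds) {
  return thresholds.map((threshold) => ({
    threshold,
    count: anomalies.filter((a) => a.anomaly_score > threshold).length,
  }));
}
```

For example, anomalies scoring 0.65, 0.72, 0.85, and 0.95 give counts of 4, 2, and 1 at thresholds 0.6, 0.8, and 0.9.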

Use Cases

Fraud Detection

    // Detect suspicious financial transactions
    const fraudAnomalies = await client.db.detectAnomalies({
      table: "transactions",
      k: 15,
      threshold: 0.9,
      limit: 100
    });

Data Quality Assurance

    // Find records with potential data quality issues
    const qualityIssues = await client.db.detectAnomalies({
      table: "user_profiles",
      k: 8,
      threshold: 0.7,
      limit: 50
    });

Security Monitoring

    // Detect unusual user behavior patterns
    const securityAnomalies = await client.db.detectAnomalies({
      table: "user_activities",
      k: 12,
      threshold: 0.85,
      limit: 30
    });

Business Intelligence

    // Find interesting outliers in business data
    const businessAnomalies = await client.db.detectAnomalies({
      table: "sales_data",
      k: 10,
      threshold: 0.75,
      limit: 20
    });
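
The four calls above differ only in their parameter values, so they can be collected into named presets. This is one possible arrangement, not an SDK feature: `PRESETS` and `paramsFor` are hypothetical names, and the values are copied from the use cases in this section.

```javascript
// Parameter sets taken from the four use cases in this section.
const PRESETS = {
  fraud: { table: "transactions", k: 15, threshold: 0.9, limit: 100 },
  dataQuality: { table: "user_profiles", k: 8, threshold: 0.7, limit: 50 },
  security: { table: "user_activities", k: 12, threshold: 0.85, limit: 30 },
  businessIntelligence: { table: "sales_data", k: 10, threshold: 0.75, limit: 20 },
};

// Hypothetical helper: returns a copy of a preset with optional overrides.
function paramsFor(useCase, overrides = {}) {
  const preset = PRESETS[useCase];
  if (!preset) {
    throw new Error(`Unknown use case: ${useCase}`);
  }
  return { ...preset, ...overrides };
}
```

A call such as `client.db.detectAnomalies(paramsFor("fraud", { table: "financial_transactions" }))` reuses the fraud settings against a different table.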