Detect Anomalies

Identifies anomalous or outlier records using K-nearest neighbors analysis on vector embeddings. Typical uses include fraud detection, data quality checks, and finding unusual patterns in your data.

POST https://api.worqhat.com/db/detect-anomalies

What Does This Endpoint Do?

This endpoint identifies records that differ significantly from the majority of your data. It compares each record's vector embedding with those of its nearest neighbors to find outliers that might indicate fraud, errors, or patterns that warrant investigation.

When to Use Anomaly Detection

This endpoint is useful for:

Fraud detection: Identify suspicious transactions, accounts, or activities
Data quality assurance: Find records that might contain errors or inconsistencies
Security monitoring: Detect unusual patterns that could indicate security breaches
Business intelligence: Discover outliers that might represent new opportunities or risks
Compliance monitoring: Identify records that don’t conform to expected patterns
Research and analysis: Find interesting edge cases or unusual data points

How It Works

You specify a table to analyze for anomalies
The system retrieves vector embeddings for all records in the table
It performs K-nearest neighbors analysis to calculate anomaly scores
Records with high anomaly scores are identified as outliers
Results include nearest neighbors and distance metrics for each anomaly
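
The steps above can be sketched locally on a handful of toy vectors. This is only an illustration of K-nearest neighbors scoring, not the service's implementation: the Euclidean distance metric, the normalization of scores into the 0-1 range, and the function names `euclidean` and `knnAnomalyScores` are all assumptions, since this page does not document how the endpoint computes its scores.

```javascript
// Euclidean distance between two equal-length numeric vectors.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// For each vector, average the distance to its k nearest neighbors,
// then divide by the largest average so every score falls in [0, 1].
// ASSUMPTION: this normalization is chosen for illustration only.
function knnAnomalyScores(vectors, k) {
  const avgDistances = vectors.map((v, i) => {
    const nearest = vectors
      .filter((_, j) => j !== i)            // exclude the record itself
      .map((other) => euclidean(v, other))
      .sort((x, y) => x - y)                // closest first
      .slice(0, k);
    return nearest.reduce((s, d) => s + d, 0) / nearest.length;
  });
  const max = Math.max(...avgDistances);
  return avgDistances.map((avg) => ({
    avg_distance: avg,
    anomaly_score: max === 0 ? 0 : avg / max,
  }));
}

// Toy data: four clustered points and one far-away point.
const embeddings = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10]];
const scores = knnAnomalyScores(embeddings, 2);

// Keep only records whose score reaches the threshold.
const flagged = scores
  .map((s, index) => ({ index, ...s }))
  .filter((s) => s.anomaly_score >= 0.8);
console.log(flagged);
```

With this toy data, only the far-away point has a large average distance to its neighbors, so it is the only record flagged.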

Code Examples

Example 1: Basic Anomaly Detection

This example shows how to detect anomalies in a transactions table.

Node.js

import Worqhat from 'worqhat';

// Initialize the client with your API key
const client = new Worqhat({
  apiKey: process.env.WORQHAT_API_KEY, // Always use environment variables for API keys
});

async function detectAnomalies() {
  try {
    // Call the detectAnomalies method
    const response = await client.db.detectAnomalies({
      table: "transactions",
      k: 10,
      threshold: 0.8,
      limit: 50,
      environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
    });
    
    // Handle the successful response
    console.log(`Analyzed ${response.total_records} records`);
    console.log(`Found ${response.anomaly_count} anomalies`);
    console.log('Anomalies:', response.anomalies);
    return response;
  } catch (error) {
    // Handle any errors
    console.error('Error detecting anomalies:', error.message);
  }
}

// Call the function
detectAnomalies();

Example 2: Fraud Detection with Custom Parameters

This example shows how to tune parameters for fraud detection in financial transactions.

Node.js

import Worqhat from 'worqhat';

// Initialize the client with your API key
const client = new Worqhat({
  apiKey: process.env.WORQHAT_API_KEY,
});

async function detectFraud() {
  try {
    // Detect fraud with stricter parameters
    const response = await client.db.detectAnomalies({
      table: "financial_transactions",
      k: 15,              // More neighbors for more stable scores
      threshold: 0.9,     // Higher threshold for stricter detection
      limit: 100,         // More results to investigate
      environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
    });
    
    // Handle the successful response
    console.log(`Fraud detection analysis complete`);
    console.log(`Total transactions analyzed: ${response.total_records}`);
    console.log(`Potential fraud cases: ${response.anomaly_count}`);
    
    // Process each anomaly
    response.anomalies.forEach((anomaly, index) => {
      console.log(`\nAnomaly ${index + 1}:`);
      console.log(`- Transaction ID: ${anomaly.record.transaction_id}`);
      console.log(`- Amount: $${anomaly.record.amount}`);
      console.log(`- Anomaly Score: ${anomaly.anomaly_score}`);
      console.log(`- Average Distance: ${anomaly.avg_distance}`);
    });
    return response;
  } catch (error) {
    console.error('Error detecting fraud:', error.message);
  }
}

detectFraud();

Example 3: Data Quality Check

This example shows how to use anomaly detection for data quality assurance.

Node.js

import Worqhat from 'worqhat';

// Initialize the client with your API key
const client = new Worqhat({
  apiKey: process.env.WORQHAT_API_KEY,
});

async function checkDataQuality() {
  try {
    // Check data quality with moderate parameters
    const response = await client.db.detectAnomalies({
      table: "user_profiles",
      k: 8,               // Moderate number of neighbors
      threshold: 0.7,      // Moderate threshold for data quality
      limit: 30,           // Reasonable number of results
      environment: process.env.WORQHAT_ENVIRONMENT || 'production' // Defaults to production
    });
    
    // Handle the successful response
    console.log(`Data quality check complete`);
    console.log(`Total profiles analyzed: ${response.total_records}`);
    console.log(`Potential data quality issues: ${response.anomaly_count}`);
    
    // Analyze anomalies for data quality issues
    response.anomalies.forEach((anomaly, index) => {
      console.log(`\nData Quality Issue ${index + 1}:`);
      console.log(`- User ID: ${anomaly.record.user_id}`);
      console.log(`- Profile: ${JSON.stringify(anomaly.record, null, 2)}`);
      console.log(`- Anomaly Score: ${anomaly.anomaly_score}`);
      console.log(`- Nearest Neighbors: ${anomaly.nearest_neighbors.length}`);
    });
    return response;
  } catch (error) {
    console.error('Error checking data quality:', error.message);
  }
}

checkDataQuality();

Request Body Explained

table

string

required

Table to analyze for anomalies. Example: “transactions”

k

number

Number of nearest neighbors to consider for anomaly detection. Range: 1-50, default: 10. Higher values provide more stable results but may miss subtle anomalies.

threshold

number

Minimum anomaly score threshold (0-1). Only records above this threshold are considered anomalies. Default: 0.8. Higher values detect fewer, more extreme anomalies.

limit

number

Maximum number of anomalies to return. Range: 1-100, default: 50.
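
The ranges and defaults listed above can be checked on the client before a request is sent. The sketch below uses a hypothetical helper, `buildDetectAnomaliesParams`, which is not part of the SDK. It assumes the threshold range 0-1 is inclusive and that any number inside a documented range is acceptable; the page does not say whether `k` and `limit` must be integers.

```javascript
// Documented ranges and defaults from the request body reference.
const LIMITS = {
  k: { min: 1, max: 50, fallback: 10 },
  threshold: { min: 0, max: 1, fallback: 0.8 },
  limit: { min: 1, max: 100, fallback: 50 },
};

// Hypothetical helper (not part of the SDK): returns a request body with
// defaults applied, or throws on the first invalid field.
function buildDetectAnomaliesParams({ table, ...options } = {}) {
  if (typeof table !== 'string' || table.trim() === '') {
    throw new TypeError('table is required and must be a non-empty string');
  }
  const params = { table };
  for (const [name, { min, max, fallback }] of Object.entries(LIMITS)) {
    const value = options[name] ?? fallback; // apply the documented default
    if (typeof value !== 'number' || Number.isNaN(value) || value < min || value > max) {
      throw new RangeError(`${name} must be a number between ${min} and ${max}`);
    }
    params[name] = value;
  }
  return params;
}

const params = buildDetectAnomaliesParams({ table: 'transactions', k: 15 });
console.log(params);
```

Validating locally surfaces a bad `k` or `limit` without spending a request, though the server remains the authority on what it accepts.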

Response Fields Explained

success

boolean

true if anomaly detection was successful, false otherwise.

anomalies

array

Array of detected anomalies, each containing:

record: The anomalous record data
anomaly_score: Anomaly score (0-1, higher = more unusual)
avg_distance: Average distance to K nearest neighbors
nearest_neighbors: Array of nearest neighbor records with distances

total_records

number

Total number of records analyzed.

anomaly_count

number

Number of anomalies detected.

parameters

object

Parameters used for the analysis (k, threshold).

executionTime

number

Analysis execution time in milliseconds.

Example Response

{
  "success": true,
  "anomalies": [
    {
      "record": {
        "id": "txn_123",
        "amount": 50000.00,
        "currency": "USD",
        "merchant": "Unknown",
        "location": "Remote"
      },
      "anomaly_score": 0.923,
      "avg_distance": 0.456,
      "nearest_neighbors": [
        {
          "record": {
            "id": "txn_456",
            "amount": 25.50,
            "currency": "USD",
            "merchant": "Coffee Shop",
            "location": "Local"
          },
          "distance": 0.234
        },
        {
          "record": {
            "id": "txn_789",
            "amount": 15.75,
            "currency": "USD",
            "merchant": "Restaurant",
            "location": "Local"
          },
          "distance": 0.312
        }
      ]
    }
  ],
  "total_records": 1000,
  "anomaly_count": 15,
  "parameters": {
    "k": 10,
    "threshold": 0.8
  },
  "executionTime": 1247
}
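
A response shaped like the one above can be reduced to a short review list. The following sketch uses a trimmed copy of the example response and a hypothetical helper, `summarizeAnomalies`. It assumes each record carries an `id` field as in the example; your own tables may use a different key.

```javascript
// Trimmed copy of the example response (fewer record fields).
const response = {
  success: true,
  anomalies: [
    {
      record: { id: 'txn_123', amount: 50000.0 },
      anomaly_score: 0.923,
      avg_distance: 0.456,
      nearest_neighbors: [
        { record: { id: 'txn_456', amount: 25.5 }, distance: 0.234 },
        { record: { id: 'txn_789', amount: 15.75 }, distance: 0.312 },
      ],
    },
  ],
  total_records: 1000,
  anomaly_count: 15,
};

// Hypothetical helper: sorts anomalies by score (highest first) and reports
// the closest neighbor of each, which shows what a "normal" record looks like.
function summarizeAnomalies(res) {
  if (!res || res.success !== true || !Array.isArray(res.anomalies)) {
    return [];
  }
  return [...res.anomalies]
    .sort((a, b) => b.anomaly_score - a.anomaly_score)
    .map((a) => {
      const neighbors = a.nearest_neighbors ?? [];
      const closest = neighbors.reduce(
        (best, n) => (best === null || n.distance < best.distance ? n : best),
        null
      );
      return {
        id: a.record.id,
        score: a.anomaly_score,
        closestNeighborId: closest ? closest.record.id : null,
        closestDistance: closest ? closest.distance : null,
      };
    });
}

const summary = summarizeAnomalies(response);
console.log(summary);
```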

Common Errors and How to Fix Them

Error	Cause	Solution
“Table not found”	The specified table doesn’t exist	Check your table name for typos
“No embeddings found”	The table doesn’t have vector embeddings	Ensure your table has been processed for embeddings
“Insufficient data”	Not enough records for meaningful analysis	Ensure you have at least 20 records in the table
“K value too large”	K parameter exceeds available data	Reduce K value or ensure you have more records
“Threshold too high”	No records meet the anomaly threshold	Lower the threshold value (try 0.6-0.7)
“Unauthorized”	Invalid or missing API key	Check that you’re using a valid API key
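
The table above can be turned into a small lookup that a `catch` block consults. This is a sketch only: `suggestFix` is a hypothetical helper, and it assumes that the thrown error's `message` contains the error text shown in the table, which this page does not confirm.

```javascript
// Error text and suggested fix, copied from the table above.
const ERROR_HINTS = [
  ['Table not found', 'Check your table name for typos'],
  ['No embeddings found', 'Ensure your table has been processed for embeddings'],
  ['Insufficient data', 'Ensure you have at least 20 records in the table'],
  ['K value too large', 'Reduce K value or ensure you have more records'],
  ['Threshold too high', 'Lower the threshold value (try 0.6-0.7)'],
  ['Unauthorized', "Check that you're using a valid API key"],
];

// Hypothetical helper: returns the suggested fix for a known error,
// or null when the message matches nothing in the table.
// ASSUMPTION: the error message includes the text from the "Error" column.
function suggestFix(error) {
  const message = String(error?.message ?? error ?? '').toLowerCase();
  const match = ERROR_HINTS.find(([needle]) => message.includes(needle.toLowerCase()));
  return match ? match[1] : null;
}

console.log(suggestFix(new Error('K value too large')));
console.log(suggestFix(new Error('Something else went wrong')));
```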

Tips for Better Anomaly Detection

Start with moderate parameters: Begin with k=10, threshold=0.8 and adjust based on results
Consider your use case:
- Fraud detection: Higher threshold (0.9+), more neighbors (k=15+)
- Data quality: Moderate threshold (0.7), moderate neighbors (k=8-10)
- General analysis: Lower threshold (0.6), fewer neighbors (k=5-8)
Review nearest neighbors: Examine what “normal” records look like to understand anomalies
Monitor execution time: Larger datasets take longer to analyze
Validate results: Manually review detected anomalies to ensure they make sense
Iterate and refine: Adjust parameters based on the quality of detected anomalies
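
The per-use-case starting points in these tips can be kept in one place so that every call begins from the same values. In this sketch, `PRESETS` and `paramsFor` are hypothetical names. Where the tips give a range or a lower bound (for example k=15+), the lowest value is used as the starting point.

```javascript
// Starting parameters taken from the tips above.
const PRESETS = {
  fraud: { k: 15, threshold: 0.9 },
  dataQuality: { k: 8, threshold: 0.7 },
  general: { k: 5, threshold: 0.6 },
};

// Hypothetical helper: merges a preset with the table name and any overrides.
function paramsFor(useCase, table, overrides = {}) {
  if (!Object.prototype.hasOwnProperty.call(PRESETS, useCase)) {
    throw new Error(`Unknown use case: ${useCase}`);
  }
  return { table, ...PRESETS[useCase], ...overrides };
}

const fraudParams = paramsFor('fraud', 'transactions', { limit: 100 });
console.log(fraudParams);
```

The returned object has the same shape as the request bodies in the code examples, so it can be adjusted as you iterate on the results.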

Parameter Tuning Guidelines

K (Number of Neighbors)

Low K (5-8): More sensitive to local variations, may detect subtle anomalies
Medium K (10-15): Balanced approach, good for most use cases
High K (20+): More stable, focuses on major outliers, less sensitive to noise

Threshold (Anomaly Score)

Low Threshold (0.6-0.7): Detects more anomalies, including minor outliers
Medium Threshold (0.8): Balanced detection, good default
High Threshold (0.9+): Only detects major anomalies, very strict
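
One way to choose a threshold is to request anomalies once with a low threshold and then count how many of the returned records would survive stricter cutoffs. The sketch below does that counting on the client. `countByThreshold` is a hypothetical helper and the scores are made-up sample values. Because the endpoint returns at most `limit` records, the counts describe only the records returned.

```javascript
// Hypothetical helper: for each candidate threshold, count the anomalies
// whose score is above it ("above" is treated as strictly greater).
function countByThreshold(anomalies, thresholds) {
  return thresholds.map((threshold) => ({
    threshold,
    count: anomalies.filter((a) => a.anomaly_score > threshold).length,
  }));
}

// Made-up scores standing in for response.anomalies from a low-threshold call.
const sampleAnomalies = [
  { anomaly_score: 0.62 },
  { anomaly_score: 0.71 },
  { anomaly_score: 0.85 },
  { anomaly_score: 0.93 },
];

const sweep = countByThreshold(sampleAnomalies, [0.6, 0.8, 0.9]);
console.log(sweep);
```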

Use Cases

The snippets below reuse the client initialized in the code examples above.

Fraud Detection

// Detect suspicious financial transactions
const fraudAnomalies = await client.db.detectAnomalies({
  table: "transactions",
  k: 15,
  threshold: 0.9,
  limit: 100
});

Data Quality Assurance

// Find records with potential data quality issues
const qualityIssues = await client.db.detectAnomalies({
  table: "user_profiles",
  k: 8,
  threshold: 0.7,
  limit: 50
});

Security Monitoring

// Detect unusual user behavior patterns
const securityAnomalies = await client.db.detectAnomalies({
  table: "user_activities",
  k: 12,
  threshold: 0.85,
  limit: 30
});

Business Intelligence

// Find interesting outliers in business data
const businessAnomalies = await client.db.detectAnomalies({
  table: "sales_data",
  k: 10,
  threshold: 0.75,
  limit: 20
});
