What Does This Endpoint Do?
This endpoint identifies records that differ significantly from the majority of your data. It runs a K-nearest neighbors analysis over vector embeddings to find outliers that might indicate fraud, errors, or other patterns that warrant investigation.
When to Use Anomaly Detection
This endpoint is useful for:
- Fraud detection: Identify suspicious transactions, accounts, or activities
- Data quality assurance: Find records that might contain errors or inconsistencies
- Security monitoring: Detect unusual patterns that could indicate security breaches
- Business intelligence: Discover outliers that might represent new opportunities or risks
- Compliance monitoring: Identify records that don’t conform to expected patterns
- Research and analysis: Find interesting edge cases or unusual data points
How It Works
- You specify a table to analyze for anomalies
- The system retrieves vector embeddings for all records in the table
- It performs K-nearest neighbors analysis to calculate anomaly scores
- Records with high anomaly scores are identified as outliers
- Results include nearest neighbors and distance metrics for each anomaly
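The scoring step can be illustrated with a minimal sketch in pure Python. It is not the service's implementation: the Euclidean distance metric, the min-max scaling of average distances onto a 0-1 score, and the function name `knn_anomaly_scores` are all assumptions made for illustration.

```python
import math

def knn_anomaly_scores(embeddings, k=10):
    """Return (avg_distance, score) per record; score is avg_distance scaled to 0-1."""
    n = len(embeddings)
    if n < 2:
        raise ValueError("need at least two records")
    k = min(k, n - 1)  # a record cannot be its own neighbor
    avg_distances = []
    for i, vec in enumerate(embeddings):
        # Distances from this record to every other record, nearest first.
        dists = sorted(
            math.dist(vec, other) for j, other in enumerate(embeddings) if j != i
        )
        avg_distances.append(sum(dists[:k]) / k)
    # Assumption: min-max scaling maps average distances onto the 0-1 score range.
    lo, hi = min(avg_distances), max(avg_distances)
    span = hi - lo
    scores = [(d - lo) / span if span else 0.0 for d in avg_distances]
    return list(zip(avg_distances, scores))

# Three tightly grouped points and one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
results = knn_anomaly_scores(points, k=2)
outliers = [i for i, (_, score) in enumerate(results) if score >= 0.8]
print(outliers)  # → [3]
```

The far-away point has the largest average distance to its two nearest neighbors, so it is the only record that clears the 0.8 cutoff.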
Code Examples
Example 1: Basic Anomaly Detection
This example shows how to detect anomalies in a transactions table.
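A minimal Python sketch of such a request, using only the standard library. The endpoint URL, the `table` field name, and the bearer-token header are placeholders assumed for illustration; substitute the values from your own account and API reference.

```python
import json
import urllib.request

# Hypothetical values: replace with your real endpoint URL and API key.
ENDPOINT_URL = "https://api.example.com/anomaly-detection"
API_KEY = "YOUR_API_KEY"

def build_request(table):
    """Build (but do not send) a POST request asking for anomalies in `table`."""
    body = json.dumps({"table": table}).encode("utf-8")  # "table" key is assumed
    return urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",  # auth scheme is assumed
        },
    )

request = build_request("transactions")
print(request.get_method(), request.full_url)
# To send it: with urllib.request.urlopen(request) as resp: data = json.load(resp)
```

Omitting the optional parameters leaves the documented defaults in effect.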
Example 2: Fraud Detection with Custom Parameters
This example shows how to tune parameters for fraud detection in financial transactions.
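As a sketch, a fraud-oriented request body might look like the following. The values follow this page's guidance for fraud detection (higher threshold, more neighbors); the field names `table` and `limit` are assumptions, and the cap of 25 is an arbitrary value inside the documented 1-100 range.

```python
import json

# Field names are assumptions; values follow this page's fraud-detection guidance.
fraud_payload = {
    "table": "transactions",
    "k": 15,           # more neighbors for more stable scores
    "threshold": 0.9,  # only flag the most extreme outliers
    "limit": 25,       # cap the number of returned anomalies (assumed name)
}
print(json.dumps(fraud_payload, indent=2))
```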
Example 3: Data Quality Check
This example shows how to use anomaly detection for data quality assurance.
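One possible data-quality workflow, sketched in Python: request anomalies with moderate parameters, then order the returned entries so the most unusual records are reviewed first. The `customers` table, the request field names, and the sample entries are made up for illustration; only `record` and `anomaly_score` come from the documented anomaly object.

```python
# Request parameters for a data-quality pass (field names are assumptions).
quality_payload = {"table": "customers", "k": 8, "threshold": 0.7}

def review_queue(anomalies):
    """Order anomaly entries so the most unusual records are reviewed first."""
    return sorted(anomalies, key=lambda a: a["anomaly_score"], reverse=True)

# Made-up entries shaped like the documented anomaly objects.
sample = [
    {"record": {"id": 1, "email": "a@example.com"}, "anomaly_score": 0.74},
    {"record": {"id": 2, "email": ""}, "anomaly_score": 0.93},
]
for entry in review_queue(sample):
    print(entry["record"]["id"], entry["anomaly_score"])
```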
Request Body Explained
- Table name: The table to analyze for anomalies. Example: “transactions”
- K (number of neighbors): Number of nearest neighbors to consider for anomaly detection. Range: 1-50, default: 10. Higher values provide more stable results but may miss subtle anomalies.
- Threshold: Minimum anomaly score (0-1). Only records scoring above this threshold are considered anomalies. Default: 0.8. Higher values detect fewer, more extreme anomalies.
- Maximum results: Maximum number of anomalies to return. Range: 1-100, default: 50.
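The documented ranges and defaults can be checked on the client before a request is sent. The sketch below assumes the field names `table`, `k`, `threshold`, and `limit`; the actual names may differ, and the helper `validate_params` is hypothetical.

```python
def validate_params(table, k=10, threshold=0.8, limit=50):
    """Apply the documented defaults and reject out-of-range values."""
    if not isinstance(table, str) or not table:
        raise ValueError("table must be a non-empty string")
    if not 1 <= k <= 50:
        raise ValueError("k must be between 1 and 50")
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    # Key names are assumptions, not confirmed field names.
    return {"table": table, "k": k, "threshold": threshold, "limit": limit}

print(validate_params("transactions"))
```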
Response Fields Explained
- Success flag: true if anomaly detection was successful, false otherwise.
- Anomalies: Array of detected anomalies, each containing:
  - record: The anomalous record data
  - anomaly_score: Anomaly score (0-1, higher = more unusual)
  - avg_distance: Average distance to the K nearest neighbors
  - nearest_neighbors: Array of nearest neighbor records with distances
- Total records: Total number of records analyzed.
- Anomalies found: Number of anomalies detected.
- Parameters: Parameters used for the analysis (k, threshold).
- Execution time: Analysis execution time in milliseconds.
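For typed access on the client side, each anomaly entry could be parsed into a small data class. This sketch uses the four documented keys of an anomaly object; the shape of each nearest-neighbor entry is not specified here, so neighbors are kept as raw values, and the sample entry is made up.

```python
from dataclasses import dataclass, field

@dataclass
class Anomaly:
    record: dict
    anomaly_score: float
    avg_distance: float
    nearest_neighbors: list = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw):
        """Build an Anomaly from one entry of the anomalies array."""
        return cls(
            record=raw["record"],
            anomaly_score=float(raw["anomaly_score"]),
            avg_distance=float(raw["avg_distance"]),
            # Neighbor entries are left unparsed because their shape is not documented.
            nearest_neighbors=list(raw.get("nearest_neighbors", [])),
        )

# Made-up entry for illustration only.
entry = {"record": {"id": 42}, "anomaly_score": 0.91, "avg_distance": 1.7}
anomaly = Anomaly.from_dict(entry)
print(anomaly.record["id"], anomaly.anomaly_score)
```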
Example Response
Common Errors and How to Fix Them
| Error | Cause | Solution |
|---|---|---|
| “Table not found” | The specified table doesn’t exist | Check your table name for typos |
| “No embeddings found” | The table doesn’t have vector embeddings | Ensure your table has been processed for embeddings |
| “Insufficient data” | Not enough records for meaningful analysis | Ensure you have at least 20 records in the table |
| “K value too large” | K parameter exceeds available data | Reduce K value or ensure you have more records |
| “Threshold too high” | No records meet the anomaly threshold | Lower the threshold value (try 0.6-0.7) |
| “Unauthorized” | Invalid or missing API key | Check that you’re using a valid API key |
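Client code can turn the table above into actionable hints. The sketch below assumes the error text arrives as a plain string containing one of the listed messages; how the API actually delivers errors (status codes, JSON fields) is not specified here, and `suggest_fix` is a hypothetical helper.

```python
# Error messages and fixes taken from the table above.
SUGGESTED_FIXES = {
    "Table not found": "Check your table name for typos",
    "No embeddings found": "Ensure your table has been processed for embeddings",
    "Insufficient data": "Ensure you have at least 20 records in the table",
    "K value too large": "Reduce K value or ensure you have more records",
    "Threshold too high": "Lower the threshold value (try 0.6-0.7)",
    "Unauthorized": "Check that you are using a valid API key",
}

def suggest_fix(error_message):
    """Return the documented fix for a known error message, or None."""
    lowered = error_message.lower()
    for known, fix in SUGGESTED_FIXES.items():
        if known.lower() in lowered:
            return fix
    return None

print(suggest_fix("Error: table not found"))
```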
Tips for Better Anomaly Detection
- Start with moderate parameters: Begin with k=10, threshold=0.8 and adjust based on results
- Consider your use case:
  - Fraud detection: Higher threshold (0.9+), more neighbors (k=15+)
  - Data quality: Moderate threshold (0.7), moderate neighbors (k=8-10)
  - General analysis: Lower threshold (0.6), fewer neighbors (k=5-8)
- Review nearest neighbors: Examine what “normal” records look like to understand anomalies
- Monitor execution time: Larger datasets take longer to analyze
- Validate results: Manually review detected anomalies to ensure they make sense
- Iterate and refine: Adjust parameters based on the quality of detected anomalies
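The use-case guidance above can be captured as starting presets. This is only a sketch: where the tips give a range, the lower bound is chosen arbitrarily, and the preset names and the helper `params_for` are hypothetical.

```python
# Starting points derived from the tips above; lower bounds chosen where a range is given.
PRESETS = {
    "fraud": {"k": 15, "threshold": 0.9},
    "data_quality": {"k": 8, "threshold": 0.7},
    "general": {"k": 5, "threshold": 0.6},
}
DEFAULT = {"k": 10, "threshold": 0.8}

def params_for(use_case):
    """Return a copy of the preset for `use_case`, or the moderate defaults."""
    return dict(PRESETS.get(use_case, DEFAULT))

print(params_for("fraud"))
print(params_for("unknown"))
```

Returning a copy keeps later adjustments from mutating the shared presets while you iterate on parameters.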
Parameter Tuning Guidelines
K (Number of Neighbors)
- Low K (5-8): More sensitive to local variations, may detect subtle anomalies
- Medium K (10-15): Balanced approach, good for most use cases
- High K (20+): More stable, focuses on major outliers, less sensitive to noise
Threshold (Anomaly Score)
- Low Threshold (0.6-0.7): Detects more anomalies, including minor outliers
- Medium Threshold (0.8): Balanced detection, good default
- High Threshold (0.9+): Only detects major anomalies, very strict
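To see how the threshold changes the number of flagged records, one option is to collect scores from a single low-threshold run and count locally how many exceed each candidate cutoff. The scores below are made up, and `count_by_threshold` is a hypothetical helper.

```python
def count_by_threshold(scores, thresholds=(0.6, 0.7, 0.8, 0.9)):
    """Count how many scores are strictly above each candidate threshold."""
    return {t: sum(1 for s in scores if s > t) for t in thresholds}

# Made-up anomaly scores for illustration.
scores = [0.95, 0.91, 0.85, 0.72, 0.65, 0.4]
print(count_by_threshold(scores))  # → {0.6: 5, 0.7: 4, 0.8: 3, 0.9: 2}
```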

