Claude Development Guide - NetStar Categorizer
Project Overview
NetStar Categorizer is a Node.js microservice that provides domain name/FQDN categorization using the inCompass NetStar SDK. It exposes both UDP and HTTP interfaces for real-time content classification with automatic daily database updates.
Core Purpose
- Categorize domains into standardized categories using NetStar's content database
- Map raw NetStar category IDs to organization-specific category codes
- Provide both simple and detailed categorization results with reputation/age ratings
- Maintain updated categorization databases through automated cron jobs
Key Architecture
Technology Stack
- Runtime: Node.js v14+
- Web Framework: Express.js v4
- External SDK: inCompass NetStar v3.1.0-2 (C++ library for categorization)
- Containerization: Docker + Kubernetes
- CI/CD: CircleCI
- Infrastructure: Cloud-agnostic (DigitalOcean, Google Cloud, etc.)
Core Services
- HTTP Server (Port 3333) - REST API for categorization requests
- UDP Server (Port 33333) - Legacy UDP interface for categorization
- Cron Service - Daily database updates via Kubernetes CronJob
- Category Mapping - NetStar ID → Organization Code conversion (singleton pattern)
Project Structure
src/
├── server.js # Entry point: initializes UDP + HTTP servers
├── app.js # Core logic: orchestrates categorization flow
├── client.js # UDP test client
├── cron.js # Scheduled database update logic
├── use-cases/
│ ├── get-category-use-case.js # Executes NetStar gcf1check command
│ ├── category-converter-use-case.js # Maps NetStar IDs → org codes (singleton)
│ ├── parse-detailed-category-use-case.js # Parses detailed output with ratings
│ └── update-categories-use-case.js # Updates NetStar databases
└── etc/
└── categories-mapping.json # Mapping table: NetStar ID → Zvelo codes
deployment/
├── deployment.yaml # Kubernetes deployment manifest
└── staging/deployment.yaml # Staging-specific config
.circleci/config.yml # CI/CD pipeline (builds, tests, deploys)
Dockerfile # Container build specification
package.json # Node dependencies + scripts
makefile # Convenience commands
test-detailed.http # HTTP endpoint test file
Language Standards
Comments and Error Messages - ENGLISH ONLY
All code comments, error messages, log statements, and documentation must be written in English.
This includes:
- ✅ Code comments explaining logic
- ✅ Error messages and exception messages
- ✅ Console.log, console.error, and logging statements
- ✅ Variable and function names
- ✅ Commit messages
- ✅ Code review feedback
- ✅ Documentation strings (JSDoc, etc.)
Examples:
// Correct: English comment
function mapCategoryId(netstarId) {
  if (!netstarId) {
    throw new Error('NetStar ID is required')
  }
  // Map to organization category code
  return categoryConverter.convert(netstarId)
}

// Incorrect: Portuguese comment
function mapCategoryId(netstarId) {
  if (!netstarId) {
    throw new Error('ID do NetStar é obrigatório') // ❌ ERROR IN PORTUGUESE
  }
  // Mapear para código de categoria da organização // ❌ COMMENT IN PORTUGUESE
  return categoryConverter.convert(netstarId)
}
Coding Conventions
Code Style
- Use ES6 syntax (const/let, arrow functions, template literals)
- No semicolons in new code (already established pattern)
- Functional/modular design - keep files focused on single responsibility
- Use singleton pattern for shared state (see: CategoryConverterUseCase)
- All comments in English (see Language Standards section)
Use Case Pattern
- Each business operation gets a dedicated use case class in src/use-cases/
- Use case classes should have a clear, single responsibility
- Example:
class GetCategoryUseCase {
  async execute(fqdn) {
    // implementation
  }
}
Environment Variables
- Defined in .env (create from .env.example)
- UDP_PORT=33333 - UDP server listen port
- HTTP_PORT=3333 - HTTP server listen port
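A minimal sketch of how the two ports could be resolved at startup, falling back to the defaults listed above. The helper names loadConfig and parsePort are hypothetical, and the real service may read these variables differently (for example through a dotenv loader).

```javascript
// Hypothetical helper: turn an environment value into a valid port number,
// using the documented default when the variable is unset.
function parsePort(value, fallback) {
  if (value === undefined || value === '') return fallback
  const port = Number(value)
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${value}`)
  }
  return port
}

// Hypothetical helper: collect the listen ports from an env-like object.
function loadConfig(env = process.env) {
  return {
    udpPort: parsePort(env.UDP_PORT, 33333),
    httpPort: parsePort(env.HTTP_PORT, 3333)
  }
}
```

Passing the environment as an argument keeps the function easy to test without touching process.env.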
HTTP API Endpoints
POST / - Basic Categorization
- Input: {"fqdn": "example.com"}
- Output: {"result": [10009, 10010]} (array of category IDs)
- Use Case: Quick lookups when detailed info is not needed
POST /detailed - Detailed Categorization
- Input: {"fqdn": "example.com"}
- Output: Full category info with reputation score, age rating, primary/secondary categories, and human-readable names
- Use Case: Comprehensive categorization for security decisions
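As an illustration of calling either endpoint from Node.js, the sketch below posts the documented request body using only the built-in http module. The function name categorize and its option defaults are assumptions for this example, and it does not inspect the HTTP status code.

```javascript
const http = require('http')

// Hypothetical client: POST {"fqdn": ...} and resolve with the parsed JSON reply.
// Use path '/' for basic results or '/detailed' for the detailed variant.
function categorize(fqdn, { host = 'localhost', port = 3333, path = '/' } = {}) {
  const payload = JSON.stringify({ fqdn })
  const options = {
    host,
    port,
    path,
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(payload)
    }
  }
  return new Promise((resolve, reject) => {
    const req = http.request(options, (res) => {
      let body = ''
      res.setEncoding('utf8')
      res.on('data', (chunk) => { body += chunk })
      res.on('end', () => {
        try {
          resolve(JSON.parse(body))
        } catch (err) {
          reject(new Error(`Invalid JSON response: ${err.message}`))
        }
      })
    })
    req.on('error', reject)
    req.end(payload)
  })
}
```

Against a locally running service, categorize('example.com') would target POST / and categorize('example.com', { path: '/detailed' }) the detailed endpoint.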
UDP Server (Port 33333)
- Input: Raw domain string (e.g., "example.com")
- Output: JSON-formatted result (same as HTTP / endpoint)
- Legacy Interface: Maintained for backward compatibility
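The UDP exchange described above can be sketched with Node's built-in dgram module: send the raw domain string, then parse the JSON reply. The function name categorizeUdp and the timeout default are assumptions for this example, not taken from src/client.js.

```javascript
const dgram = require('dgram')

// Hypothetical client: send a raw domain string over UDP and resolve with
// the parsed JSON reply. UDP has no delivery guarantee, so a timeout is used.
function categorizeUdp(fqdn, { host = 'localhost', port = 33333, timeoutMs = 2000 } = {}) {
  return new Promise((resolve, reject) => {
    const socket = dgram.createSocket('udp4')
    let settled = false
    // Settle the promise once, then release the timer and the socket.
    const finish = (settle, value) => {
      if (settled) return
      settled = true
      clearTimeout(timer)
      socket.close()
      settle(value)
    }
    const timer = setTimeout(
      () => finish(reject, new Error(`No UDP reply within ${timeoutMs} ms`)),
      timeoutMs
    )
    socket.once('error', (err) => finish(reject, err))
    socket.once('message', (msg) => {
      try {
        finish(resolve, JSON.parse(msg.toString('utf8')))
      } catch (err) {
        finish(reject, new Error(`Invalid JSON reply: ${err.message}`))
      }
    })
    socket.send(fqdn, port, host)
  })
}
```

The timeout matters here because a lost datagram would otherwise leave the promise pending forever.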
Development Workflow
Setup & Running Locally
# Install dependencies
npm install
# Development with auto-reload
npm run dev:server # Watch mode for server changes
npm run dev:client # Run UDP client for testing
# Production
npm start # Start both servers
# NetStar service commands (Linux system)
make gcf1-start # Start NetStar service
make gcf1-download # Download category databases
make gcf1-update # Update category databases
Testing APIs
Use test-detailed.http with the VS Code REST Client extension:
- Open the file
- Click "Send Request" on each endpoint
- View responses in the side panel
Git Workflow
- Main Branch: main - production stable code
- Development Branch: development - feature integration
- Feature Branches: Create from development, merge back via PR
- Recent commits show HTTP approach implementation and cron job additions
- Commit Messages: Must be in English
Deployment
Docker
- Base Image: Ubuntu 22.04
- Includes: Boost libraries, Node.js, NetStar SDK
- Exposes: Port 3000 (UDP)
- Build: docker build -t netstar-categorizer .
Kubernetes (Production)
- Namespace: blackdice
- Deployment: Single replica in appropriate cluster
- CronJob: Daily database updates at 00:00 UTC
- Ingress: netstar-cat-dev.blackdice.ai (DNS varies by environment)
- Branches to Environments:
  - development → development cluster
  - qa → QA cluster
  - staging → staging cluster
  - production → production cluster
  - gke-staging, gke-pov → specific GKE clusters
CI/CD Pipeline (CircleCI)
- Automatically builds Docker image on push
- Tags image with commit SHA
- Deploys to appropriate Kubernetes cluster based on branch
- Release deployments via git tags
Key Technical Details
Category Mapping System
- Source: NetStar SDK returns numeric category IDs (e.g., 101, 102)
- Mapping File: src/etc/categories-mapping.json
- Target: Maps to organization's Zvelo pattern codes (e.g., 10075, 10078)
- Singleton Implementation: CategoryConverterUseCase maintains a single instance across the app
- Example Mapping:
- NetStar 101 (Illegal Activities) → Zvelo 10075
- NetStar 201 (Terrorism/Extremists) → Zvelo 10018
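The mapping step can be illustrated with a small singleton sketch. The class name CategoryConverterSketch, the mapping shape (NetStar ID as key, organization code as value), and the choice to drop unmapped IDs are all assumptions; the actual layout of categories-mapping.json and the behavior of CategoryConverterUseCase may differ. The two entries are the example mappings listed above.

```javascript
// Hypothetical singleton converter: NetStar IDs in, organization codes out.
class CategoryConverterSketch {
  constructor(mapping) {
    // Object keys are strings, so normalize them to numeric NetStar IDs.
    this.mapping = new Map(
      Object.entries(mapping).map(([id, code]) => [Number(id), code])
    )
  }

  // Create the instance on first use and return the same one afterwards.
  static getInstance(mapping) {
    if (!CategoryConverterSketch.instance) {
      CategoryConverterSketch.instance = new CategoryConverterSketch(mapping)
    }
    return CategoryConverterSketch.instance
  }

  // Assumption: IDs without a mapping are dropped rather than reported.
  convert(netstarIds) {
    return netstarIds
      .filter((id) => this.mapping.has(id))
      .map((id) => this.mapping.get(id))
  }
}

const converter = CategoryConverterSketch.getInstance({ 101: 10075, 201: 10018 })
console.log(converter.convert([101, 201, 999])) // [ 10075, 10018 ]
```

Because the instance is created once, a later getInstance call with a different mapping still returns the first instance, which is why a changed mapping only takes effect after a restart in this sketch.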
NetStar SDK Integration
- Command: gcf1check - queries the NetStar database for domain categorization
- Child Process: Executed via Node.js child_process module
- Output Parsing: Raw output parsed into JSON structure
- Detailed Mode: Includes reputation scores and age ratings in output
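A rough sketch of the child-process step, under stated assumptions: it uses execFile (the guide does not say which child_process function the project calls), and the gcf1check argument list is not documented here, so passing only the FQDN is a placeholder. The names runCommand and GetCategorySketch are hypothetical, and output parsing is left out because the raw output format is not described in this guide.

```javascript
const { execFile } = require('child_process')

// Run a binary with arguments (no shell involved) and resolve with its stdout.
function runCommand(binary, args, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    execFile(binary, args, { timeout: timeoutMs }, (err, stdout) => {
      if (err) return reject(err)
      resolve(stdout)
    })
  })
}

// Hypothetical use case. Assumption: gcf1check accepts the FQDN as its only
// argument; adjust the argument list to the real SDK invocation.
class GetCategorySketch {
  constructor({ binary = 'gcf1check', run = runCommand } = {}) {
    this.binary = binary
    this.run = run
  }

  async execute(fqdn) {
    if (!fqdn) throw new Error('FQDN is required')
    const raw = await this.run(this.binary, [fqdn])
    return raw.trim() // raw text; parsing into JSON is a separate step
  }
}
```

execFile passes arguments directly to the binary instead of through a shell, which is a reasonable choice when the domain string arrives from the network. Injecting the runner also lets the use case be tested without the SDK installed.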
Automatic Database Updates
- Mechanism: Kubernetes CronJob at 0 0 * * * (daily at midnight UTC)
- Fallback: Manual update via make gcf1-update
- Purpose: Keeps categorization database current with latest NetStar classifications
Common Tasks
Adding a New Endpoint
- Create a corresponding use case in src/use-cases/
- Add route in src/app.js that calls the use case
- Export and test in test-detailed.http
- Update this guide if it's a significant feature
- Ensure all error messages and comments are in English
Updating Category Mappings
- Modify src/etc/categories-mapping.json with new ID mappings
- Restart the service (the singleton reloads the mapping on the next request)
- Test with both HTTP and UDP interfaces
Debugging
- Server Logs: Check Docker/Kubernetes logs for errors
- Cron Logs: View Kubernetes CronJob logs for database update issues
- UDP Testing: Use npm run dev:client to test directly
- HTTP Testing: Use test-detailed.http with VS Code REST Client
- Error Messages: All error logs must be in English
Troubleshooting
- NetStar Service Not Running: Run make gcf1-start
- Stale Categories: Manually run make gcf1-update or wait for cron job
- Port Conflicts: Ensure ports 3333 (HTTP) and 33333 (UDP) are available
- Docker Build Issues: Check that Boost C++ libraries are installed correctly
Current Development Status
Recent Work
- ✅ HTTP server implementation (alongside UDP)
- ✅ Detailed categorization with reputation/age ratings
- ✅ Cron job for automated daily updates
- ✅ Singleton category converter pattern
- 🔄 Work in Progress:
  - playground.js - experimental/testing code
  - parse-detailed-category-use-case.js - new detailed parsing feature
  - Enhanced server.js - expanded server capabilities
Known Modified Files
- playground.js - development/testing (can be cleaned up)
- src/server.js - recent enhancements
- makefile - new convenience commands
- test-detailed.http - expanded test coverage
Guidelines for Contributions
- Follow Existing Patterns: Use use-case classes, follow module structure
- Test Before Committing: Use test-detailed.http for API changes
- Update Mappings Properly: Edit categories-mapping.json rather than hardcoding values
- Document Breaking Changes: Update this guide if architecture changes
- Keep CircleCI Happy: Ensure Docker build succeeds and K8s deployment configs are valid
- Don't Skip Steps: Always test UDP and HTTP interfaces for categorization changes
- Language Standards: All comments, error messages, and logs must be in English
Resources & External Documentation
- NetStar SDK: Installed in Docker, documentation in inCompass SDK v3.1.0-2
- Express.js: https://expressjs.com
- Node.js Child Process: https://nodejs.org/api/child_process.html
- Kubernetes: https://kubernetes.io/docs
- CircleCI: Configuration at .circleci/config.yml