# Claude Development Guide - NetStar Categorizer

## Project Overview

**NetStar Categorizer** is a Node.js microservice that provides domain name/FQDN categorization using the inCompass NetStar SDK. It exposes both UDP and HTTP interfaces for real-time content classification with automatic daily database updates.

### Core Purpose

- Categorize domains into standardized categories using NetStar's content database
- Map raw NetStar category IDs to organization-specific category codes
- Provide both simple and detailed categorization results with reputation/age ratings
- Maintain updated categorization databases through automated cron jobs

## Key Architecture

### Technology Stack

- **Runtime**: Node.js v14+
- **Web Framework**: Express.js v4
- **External SDK**: inCompass NetStar v3.1.0-2 (C++ library for categorization)
- **Containerization**: Docker + Kubernetes
- **CI/CD**: CircleCI
- **Infrastructure**: Cloud-agnostic (DigitalOcean, Google Cloud, etc.)

### Core Services

1. **HTTP Server** (Port 3333) - REST API for categorization requests
2. **UDP Server** (Port 33333) - Legacy UDP interface for categorization
3. **Cron Service** - Daily database updates via Kubernetes CronJob
4. **Category Mapping** - NetStar ID → Organization Code conversion (singleton pattern)

## Project Structure

```
src/
├── server.js                               # Entry point: initializes UDP + HTTP servers
├── app.js                                  # Core logic: orchestrates categorization flow
├── client.js                               # UDP test client
├── cron.js                                 # Scheduled database update logic
├── use-cases/
│   ├── get-category-use-case.js            # Executes NetStar gcf1check command
│   ├── category-converter-use-case.js      # Maps NetStar IDs → org codes (singleton)
│   ├── parse-detailed-category-use-case.js # Parses detailed output with ratings
│   └── update-categories-use-case.js       # Updates NetStar databases
└── etc/
    └── categories-mapping.json             # Mapping table: NetStar ID → Zvelo codes

deployment/
├── deployment.yaml                         # Kubernetes deployment manifest
└── staging/deployment.yaml                 # Staging-specific config

.circleci/config.yml                        # CI/CD pipeline (builds, tests, deploys)
Dockerfile                                  # Container build specification
package.json                                # Node dependencies + scripts
makefile                                    # Convenience commands
test-detailed.http                          # HTTP endpoint test file
```

## Language Standards

### Comments and Error Messages - ENGLISH ONLY

**All code comments, error messages, log statements, and documentation must be written in English.**

This includes:

- ✅ Code comments explaining logic
- ✅ Error messages and exception messages
- ✅ Console.log, console.error, and logging statements
- ✅ Variable and function names
- ✅ Commit messages
- ✅ Code review feedback
- ✅ Documentation strings (JSDoc, etc.)


**Examples:**

```javascript
// Correct: English comment
function mapCategoryId(netstarId) {
  if (!netstarId) {
    throw new Error('NetStar ID is required')
  }
  // Map to organization category code
  return categoryConverter.convert(netstarId)
}

// Incorrect: Portuguese comment
function mapCategoryId(netstarId) {
  if (!netstarId) {
    throw new Error('ID do NetStar é obrigatório') // ❌ ERROR IN PORTUGUESE
  }
  // Mapear para código de categoria da organização // ❌ COMMENT IN PORTUGUESE
  return categoryConverter.convert(netstarId)
}
```

## Coding Conventions

### Code Style

- Use **ES6 syntax** (const/let, arrow functions, template literals)
- **No semicolons** in new code (already established pattern)
- Functional/modular design - keep files focused on single responsibility
- Use **singleton pattern** for shared state (see: `CategoryConverterUseCase`)
- **All comments in English** - see Language Standards section

### Use Case Pattern

- Each business operation gets a dedicated use case class in `src/use-cases/`
- Use case classes should have a clear, single responsibility
- Example:

  ```javascript
  class GetCategoryUseCase {
    async execute(fqdn) {
      // implementation
    }
  }
  ```

### Environment Variables

- Defined in `.env` (create from `.env.example`)
- `UDP_PORT=33333` - UDP server listen port
- `HTTP_PORT=3333` - HTTP server listen port

## HTTP API Endpoints

### `POST /` - Basic Categorization

- **Input**: `{"fqdn": "example.com"}`
- **Output**: `{"result": [10009, 10010]}` (array of category IDs)
- **Use Case**: Quick lookups when detailed info is not needed

### `POST /detailed` - Detailed Categorization

- **Input**: `{"fqdn": "example.com"}`
- **Output**: Full category info with reputation score, age rating, primary/secondary categories, and human-readable names
- **Use Case**: Comprehensive categorization for security decisions

### UDP Server (Port 33333)

- **Input**: Raw domain string (e.g., `"example.com"`)
- **Output**: JSON-formatted result (same as HTTP `/` endpoint)
- **Legacy Interface**: Maintained for backward compatibility

## Development Workflow

### Setup & Running Locally

```bash
# Install dependencies
npm install

# Development with auto-reload
npm run dev:server    # Watch mode for server changes
npm run dev:client    # Run UDP client for testing

# Production
npm start             # Start both servers

# NetStar service commands (Linux system)
make gcf1-start       # Start NetStar service
make gcf1-download    # Download category databases
make gcf1-update      # Update category databases
```

### Testing APIs

Use `test-detailed.http` with the VS Code REST Client extension:

1. Open the file
2. Click "Send Request" on each endpoint
3. View responses in the side panel

### Git Workflow

- **Main Branch**: `main` - production stable code
- **Development Branch**: `development` - feature integration
- **Feature Branches**: Create from `development`, merge back via PR
- Recent commits show HTTP approach implementation and cron job additions
- **Commit Messages**: Must be in English

## Deployment

### Docker

- **Base Image**: Ubuntu 22.04
- **Includes**: Boost libraries, Node.js, NetStar SDK
- **Exposes**: Port 3000 (UDP)
- Build: `docker build -t netstar-categorizer .`

### Kubernetes (Production)

- **Namespace**: `blackdice`
- **Deployment**: Single replica in appropriate cluster
- **CronJob**: Daily database updates at 00:00 UTC
- **Ingress**: `netstar-cat-dev.blackdice.ai` (DNS varies by environment)
- **Branches to Environments**:
  - `development` → development cluster
  - `qa` → QA cluster
  - `staging` → staging cluster
  - `production` → production cluster
  - `gke-staging`, `gke-pov` → specific GKE clusters

### CI/CD Pipeline (CircleCI)

- Automatically builds Docker image on push
- Tags image with commit SHA
- Deploys to appropriate Kubernetes cluster based on branch
- Release deployments via git tags

## Key Technical Details

### Category Mapping System

- **Source**: NetStar SDK returns numeric category IDs (e.g., 101, 102)
- **Mapping File**: `src/etc/categories-mapping.json`
- **Target**: Maps to organization's Zvelo pattern codes (e.g., 10075, 10078)
- **Singleton Implementation**: `CategoryConverterUseCase` maintains single instance across app
- **Example Mapping**:
  - NetStar 101 (Illegal Activities) → Zvelo 10075
  - NetStar 201 (Terrorism/Extremists) → Zvelo 10018

### NetStar SDK Integration

- **Command**: `gcf1check` - queries the NetStar database for domain categorization
- **Child Process**: Executed via Node.js `child_process` module
- **Output Parsing**: Raw output parsed into JSON structure
- **Detailed Mode**: Includes reputation scores and age ratings in output

### Automatic Database Updates

- **Mechanism**: Kubernetes CronJob at `0 0 * * *` (daily at midnight UTC)
- **Fallback**: Manual update via `make gcf1-update`
- **Purpose**: Keeps categorization database current with latest NetStar classifications

## Common Tasks

### Adding a New Endpoint

1. Create a corresponding use case in `src/use-cases/`
2. Add route in `src/app.js` that calls the use case
3. Export and test in `test-detailed.http`
4. Update this guide if it's a significant feature
5. Ensure all error messages and comments are in English

### Updating Category Mappings

1. Modify `src/etc/categories-mapping.json` with new ID mappings
2. Restart the service (singleton will reload on next request)
3. Test with both HTTP and UDP interfaces

### Debugging

- **Server Logs**: Check Docker/Kubernetes logs for errors
- **Cron Logs**: View Kubernetes CronJob logs for database update issues
- **UDP Testing**: Use `npm run dev:client` to test directly
- **HTTP Testing**: Use `test-detailed.http` with VS Code REST Client
- **Error Messages**: All error logs must be in English

### Troubleshooting

- **NetStar Service Not Running**: Run `make gcf1-start`
- **Stale Categories**: Manually run `make gcf1-update` or wait for cron job
- **Port Conflicts**: Ensure ports 3333 (HTTP) and 33333 (UDP) are available
- **Docker Build Issues**: Check that Boost C++ libraries are installed correctly

## Current Development Status

### Recent Work

- ✅ HTTP server implementation (alongside UDP)
- ✅ Detailed categorization with reputation/age ratings
- ✅ Cron job for automated daily updates
- ✅ Singleton category converter pattern
- 🔄 Work in Progress:
  - `playground.js` - experimental/testing code
  - `parse-detailed-category-use-case.js` - new detailed parsing feature
  - Enhanced `server.js` - expanded server capabilities

### Known Modified Files

- `playground.js` - development/testing (can be cleaned up)
- `src/server.js` - recent enhancements
- `makefile` - new convenience commands
- `test-detailed.http` - expanded test coverage

## Guidelines for Contributions

1. **Follow Existing Patterns**: Use use-case classes, follow module structure
2. **Test Before Committing**: Use `test-detailed.http` for API changes
3. **Update Mappings Properly**: Edit `categories-mapping.json`; do not hardcode values
4. **Document Breaking Changes**: Update this guide if architecture changes
5. **Keep CircleCI Happy**: Ensure Docker build succeeds and K8s deployment configs are valid
6. **Don't Skip Steps**: Always test UDP and HTTP interfaces for categorization changes
7. **Language Standards**: All comments, error messages, and logs must be in English

## Resources & External Documentation

- **NetStar SDK**: Installed in Docker, documentation in inCompass SDK v3.1.0-2
- **Express.js**: https://expressjs.com
- **Node.js Child Process**: https://nodejs.org/api/child_process.html
- **Kubernetes**: https://kubernetes.io/docs
- **CircleCI**: Configuration at `.circleci/config.yml`
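
## Appendix: Category Mapping Sketch

To illustrate the Category Mapping System described in this guide, here is a minimal sketch of a singleton converter that turns NetStar IDs into organization codes. The class name `CategoryConverterSketch`, the `getInstance` and `convertAll` methods, and the flat `{ netstarId: code }` mapping shape are assumptions made for illustration; the actual `CategoryConverterUseCase` and `src/etc/categories-mapping.json` may be organized differently. Only the two mappings listed in this guide are used.

```javascript
// Hypothetical mapping shape: the real categories-mapping.json may differ.
const SAMPLE_MAPPING = {
  101: 10075, // Illegal Activities
  201: 10018 // Terrorism/Extremists
}

class CategoryConverterSketch {
  constructor(mapping) {
    this.mapping = mapping
  }

  // Return the shared instance, creating it on first use
  static getInstance() {
    if (!CategoryConverterSketch.instance) {
      CategoryConverterSketch.instance = new CategoryConverterSketch(SAMPLE_MAPPING)
    }
    return CategoryConverterSketch.instance
  }

  // Convert one NetStar ID; returns null when the ID has no mapping
  convert(netstarId) {
    if (netstarId === undefined || netstarId === null) {
      throw new Error('NetStar ID is required')
    }
    const code = this.mapping[String(netstarId)]
    return code === undefined ? null : code
  }

  // Convert a list of NetStar IDs, dropping unmapped ones
  convertAll(netstarIds) {
    return netstarIds
      .map((id) => this.convert(id))
      .filter((code) => code !== null)
  }
}

const converter = CategoryConverterSketch.getInstance()
console.log(converter.convertAll([101, 201, 999]))
```

Because every caller goes through `getInstance()`, the mapping is held in one place for the lifetime of the process, which is consistent with the guide's note that a restart is needed after editing the mapping file. Dropping unmapped IDs in `convertAll` is a design choice of this sketch, not a documented behavior of the service.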