2 Commits

Author        SHA1        Message                         Date
daniel muniz  dfda9eac9a  implemented detailed endpoint   2026-04-14 09:00:25 -03:00
Daniel Muniz  0bc1adb738  Update categories-mapping.json  2025-06-18 09:05:24 -03:00
9 changed files with 536 additions and 64 deletions

276
claude.md Normal file

@@ -0,0 +1,276 @@
# Claude Development Guide - NetStar Categorizer
## Project Overview
**NetStar Categorizer** is a Node.js microservice that provides domain name/FQDN categorization using the inCompass NetStar SDK. It exposes both UDP and HTTP interfaces for real-time content classification with automatic daily database updates.
### Core Purpose
- Categorize domains into standardized categories using NetStar's content database
- Map raw NetStar category IDs to organization-specific category codes
- Provide both simple and detailed categorization results with reputation/age ratings
- Maintain updated categorization databases through automated cron jobs
## Key Architecture
### Technology Stack
- **Runtime**: Node.js v14+
- **Web Framework**: Express.js v4
- **External SDK**: inCompass NetStar v3.1.0-2 (C++ library for categorization)
- **Containerization**: Docker + Kubernetes
- **CI/CD**: CircleCI
- **Infrastructure**: Cloud-agnostic (DigitalOcean, Google Cloud, etc.)
### Core Services
1. **HTTP Server** (Port 3333) - REST API for categorization requests
2. **UDP Server** (Port 33333) - Legacy UDP interface for categorization
3. **Cron Service** - Daily database updates via Kubernetes CronJob
4. **Category Mapping** - NetStar ID → Organization Code conversion (singleton pattern)
## Project Structure
```
src/
├── server.js # Entry point: initializes UDP + HTTP servers
├── app.js # Core logic: orchestrates categorization flow
├── client.js # UDP test client
├── cron.js # Scheduled database update logic
├── use-cases/
│ ├── get-category-use-case.js # Executes NetStar gcf1check command
│ ├── category-converter-use-case.js # Maps NetStar IDs → org codes (singleton)
│ ├── parse-detailed-category-use-case.js # Parses detailed output with ratings
│ └── update-categories-use-case.js # Updates NetStar databases
└── etc/
└── categories-mapping.json # Mapping table: NetStar ID → Zvelo codes
deployment/
├── deployment.yaml # Kubernetes deployment manifest
└── staging/deployment.yaml # Staging-specific config
.circleci/config.yml # CI/CD pipeline (builds, tests, deploys)
Dockerfile # Container build specification
package.json # Node dependencies + scripts
makefile # Convenience commands
test-detailed.http # HTTP endpoint test file
```
## Language Standards
### Comments and Error Messages - ENGLISH ONLY
**All code comments, error messages, log statements, and documentation must be written in English.**
This includes:
- ✅ Code comments explaining logic
- ✅ Error messages and exception messages
- ✅ Console.log, console.error, and logging statements
- ✅ Variable and function names
- ✅ Commit messages
- ✅ Code review feedback
- ✅ Documentation strings (JSDoc, etc.)
**Examples:**
```javascript
// Correct: English comment
function mapCategoryId(netstarId) {
if (!netstarId) {
throw new Error('NetStar ID is required')
}
// Map to organization category code
return categoryConverter.convert(netstarId)
}
// Incorrect: Portuguese comment
function mapCategoryId(netstarId) {
if (!netstarId) {
throw new Error('ID do NetStar é obrigatório') // ❌ ERROR IN PORTUGUESE
}
// Mapear para código de categoria da organização // ❌ COMMENT IN PORTUGUESE
return categoryConverter.convert(netstarId)
}
```
## Coding Conventions
### Code Style
- Use **ES6 syntax** (const/let, arrow functions, template literals)
- **No semicolons** in new code (already established pattern)
- Functional/modular design - keep files focused on single responsibility
- Use **singleton pattern** for shared state (see: `CategoryConverterUseCase`)
- **All comments in English** - see Language Standards section
### Use Case Pattern
- Each business operation gets a dedicated use case class in `src/use-cases/`
- Use case classes should have a clear, single responsibility
- Example:
```javascript
class GetCategoryUseCase {
async execute(fqdn) {
// implementation
}
}
```
### Environment Variables
- Defined in `.env` (create from `.env.example`)
- `UDP_PORT=33333` - UDP server listen port
- `HTTP_PORT=3333` - HTTP server listen port
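As a rough illustration of how these variables might be read defensively, the sketch below parses a port from an environment-like object and falls back to the defaults listed above. `readPort` is a hypothetical helper introduced for this example; the service is not known to define it.

```javascript
// Hypothetical helper (not part of the service as shown): read a port from an
// env-like object, falling back to a default when the variable is unset.
function readPort(env, name, fallback) {
  const raw = env[name]
  if (raw === undefined || raw === '') return fallback
  const port = Number(raw)
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`${name} must be an integer between 1 and 65535, got "${raw}"`)
  }
  return port
}

// Defaults mirror the values documented above
const HTTP_PORT = readPort(process.env, 'HTTP_PORT', 3333)
const UDP_PORT = readPort(process.env, 'UDP_PORT', 33333)
```

Failing fast on a malformed port tends to be easier to debug than a server that silently binds to an unexpected port.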
## HTTP API Endpoints
### `POST /` - Basic Categorization
- **Input**: `{"fqdn": "example.com"}`
- **Output**: `{"result": [10009, 10010]}` (array of category IDs)
- **Use Case**: Quick lookups when detailed information is not needed
### `POST /detailed` - Detailed Categorization
- **Input**: `{"fqdn": "example.com"}`
- **Output**: Full category info with reputation score, age rating, primary/secondary categories, and human-readable names
- **Use Case**: Comprehensive categorization for security decisions
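Both HTTP endpoints accept the same `{"fqdn": ...}` body. A minimal sketch of validating that body before it reaches a use case could look like the following; `parseFqdnBody` is a hypothetical helper and the accepted character set is an assumption, not a rule taken from the NetStar SDK.

```javascript
// Hypothetical helper (not part of the service as shown): extract and
// normalize the "fqdn" field of a request body, or throw if it is unusable.
function parseFqdnBody(body) {
  const fqdn = body && typeof body.fqdn === 'string' ? body.fqdn.trim().toLowerCase() : ''
  // Assumed rule: dot-separated labels of letters, digits, hyphens and
  // underscores, with an optional trailing dot, at most 253 characters
  if (!/^[a-z0-9_-]+(\.[a-z0-9_-]+)*\.?$/.test(fqdn) || fqdn.length > 253) {
    throw new Error('Request body must contain a valid "fqdn" string')
  }
  return fqdn
}

console.log(parseFqdnBody({ fqdn: ' Example.COM ' })) // prints example.com
```

A route handler could call this first and answer with HTTP 400 when it throws, keeping 500 for genuine server-side failures.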
### UDP Server (Port 33333)
- **Input**: Raw domain string (e.g., `"example.com"`)
- **Output**: JSON-formatted result (same as HTTP `/` endpoint)
- **Legacy Interface**: Maintained for backward compatibility
## Development Workflow
### Setup & Running Locally
```bash
# Install dependencies
npm install
# Development with auto-reload
npm run dev:server # Watch mode for server changes
npm run dev:client # Run UDP client for testing
# Production
npm start # Start both servers
# NetStar service commands (Linux system)
make gcf1-start # Start NetStar service
make gcf1-download # Download category databases
make gcf1-update # Update category databases
```
### Testing APIs
Use `test-detailed.http` with the VS Code REST Client extension:
1. Open the file
2. Click "Send Request" on each endpoint
3. View responses in the side panel
### Git Workflow
- **Main Branch**: `main` - production stable code
- **Development Branch**: `development` - feature integration
- **Feature Branches**: Create from `development`, merge back via PR
- Recent commits show HTTP approach implementation and cron job additions
- **Commit Messages**: Must be in English
## Deployment
### Docker
- **Base Image**: Ubuntu 22.04
- **Includes**: Boost libraries, Node.js, NetStar SDK
- **Exposes**: Port 3000 (UDP)
- Build: `docker build -t netstar-categorizer .`
### Kubernetes (Production)
- **Namespace**: `blackdice`
- **Deployment**: Single replica in appropriate cluster
- **CronJob**: Daily database updates at 00:00 UTC
- **Ingress**: `netstar-cat-dev.blackdice.ai` (DNS varies by environment)
- **Branches to Environments**:
- `development` → development cluster
- `qa` → QA cluster
- `staging` → staging cluster
- `production` → production cluster
- `gke-staging`, `gke-pov` → specific GKE clusters
### CI/CD Pipeline (CircleCI)
- Automatically builds Docker image on push
- Tags image with commit SHA
- Deploys to appropriate Kubernetes cluster based on branch
- Release deployments via git tags
## Key Technical Details
### Category Mapping System
- **Source**: NetStar SDK returns numeric category IDs (e.g., 101, 102)
- **Mapping File**: `src/etc/categories-mapping.json`
- **Target**: Maps to organization's Zvelo pattern codes (e.g., 10075, 10078)
- **Singleton Implementation**: `CategoryConverterUseCase` maintains single instance across app
- **Example Mapping**:
- NetStar 101 (Illegal Activities) → Zvelo 10075
- NetStar 201 (Terrorism/Extremists) → Zvelo 10018
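To make the lookup idea concrete, here is a minimal sketch of a mapping index with a shared instance. It assumes each entry in `categories-mapping.json` has an `id`, a `description`, and a `related` array, and that a lookup goes from a `related` ID to the entry's `id`. Both the lookup direction and the names `CategoryLookup` and `getCategoryLookup` are assumptions for illustration; the real `CategoryConverterUseCase` may work differently.

```javascript
// Hypothetical sketch, not the real CategoryConverterUseCase.
class CategoryLookup {
  constructor({ categoriesMapping }) {
    // Index every related ID back to the entry that lists it
    this.idByRelated = new Map()
    for (const entry of categoriesMapping) {
      for (const relatedId of entry.related || []) {
        this.idByRelated.set(String(relatedId), entry.id)
      }
    }
  }

  // Return the mapped ID, or null when the source ID is unknown
  execute(sourceId) {
    const mapped = this.idByRelated.get(String(sourceId))
    return mapped === undefined ? null : mapped
  }
}

// Module-level shared instance: built on first use, reused afterwards
let sharedLookup = null
function getCategoryLookup(categoriesMapping) {
  if (sharedLookup === null) {
    sharedLookup = new CategoryLookup({ categoriesMapping })
  }
  return sharedLookup
}

const lookup = getCategoryLookup([
  { id: '20097', description: 'VPN', related: ['10460'] }
])
console.log(lookup.execute('10460')) // prints 20097
console.log(lookup.execute('999')) // prints null
```

Building the `Map` once keeps each lookup constant-time regardless of how many entries the mapping file holds.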
### NetStar SDK Integration
- **Command**: `gcf1check` - queries the NetStar database for domain categorization
- **Child Process**: Executed via Node.js `child_process` module
- **Output Parsing**: Raw output parsed into JSON structure
- **Detailed Mode**: Includes reputation scores and age ratings in output
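One way to pass a domain to the checker without building a shell command string is to send it on stdin with `execFileSync`. The sketch below is only an illustration: `runWithStdin` and `checkDomain` are hypothetical names, the script path, arguments, and working directory are assumed from the command line used by the existing use cases, and the call is synchronous purely for brevity (a request handler would more likely use the asynchronous `spawn`).

```javascript
const { execFileSync } = require('node:child_process')

// Hypothetical helper: run a command with `input` on stdin, bypassing the
// shell so the input is never interpreted as shell syntax.
function runWithStdin(file, args, input, options = {}) {
  return execFileSync(file, args, { ...options, input, encoding: 'utf8' })
}

// Assumed invocation of the NetStar check script (path and arguments are
// taken from the existing use case code, not from SDK documentation).
function checkDomain(fqdn) {
  return runWithStdin(
    'bin/gcf1check.sh',
    ['etc', 'check_categorize_hybrid'],
    `${fqdn}\n`,
    { cwd: '/usr/local/gcf1' }
  )
}
```

Because no shell is involved, characters such as `;` or `$(...)` in the input reach the script as plain text.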
### Automatic Database Updates
- **Mechanism**: Kubernetes CronJob with schedule `0 0 * * *` (daily at midnight UTC)
- **Fallback**: Manual update via `make gcf1-update`
- **Purpose**: Keeps categorization database current with latest NetStar classifications
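For local runs without Kubernetes, the same once-a-day rhythm could be approximated in process. The sketch below only computes the delay until the next 00:00 UTC; `msUntilNextMidnightUtc` is a hypothetical helper and is not how `cron.js` is known to work.

```javascript
// Hypothetical helper: milliseconds from `now` until the next 00:00:00 UTC.
// Date.UTC normalizes day overflow, so month and year ends are handled.
function msUntilNextMidnightUtc(now = new Date()) {
  const next = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + 1)
  return next - now.getTime()
}

console.log(msUntilNextMidnightUtc(new Date('2025-06-18T23:00:00Z'))) // prints 3600000
```

The result could be handed to `setTimeout` to trigger an update, after which the delay is computed again for the following day.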
## Common Tasks
### Adding a New Endpoint
1. Create a corresponding use case in `src/use-cases/`
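2. Add a route in `src/server.js` (where the existing `httpServer.post` handlers live) that calls the use case
3. Export the use case class and add a request for the new route to `test-detailed.http`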
4. Update this guide if it's a significant feature
5. Ensure all error messages and comments are in English
### Updating Category Mappings
1. Modify `src/etc/categories-mapping.json` with new ID mappings
2. Restart the service; the mapping file is loaded once at startup, so a running process does not pick up changes
3. Test with both HTTP and UDP interfaces
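Before restarting, a quick sanity check of the edited file can catch simple mistakes. The following sketch assumes entries have a string `id` and a `related` array; `findMappingProblems` is a hypothetical helper, not an existing script in the repository.

```javascript
// Hypothetical helper: report missing or duplicate IDs and malformed
// "related" fields in a list of mapping entries.
function findMappingProblems(entries) {
  const problems = []
  const seen = new Set()
  entries.forEach((entry, index) => {
    if (typeof entry.id !== 'string' || entry.id === '') {
      problems.push(`entry ${index}: missing id`)
    } else if (seen.has(entry.id)) {
      problems.push(`entry ${index}: duplicate id ${entry.id}`)
    } else {
      seen.add(entry.id)
    }
    if (!Array.isArray(entry.related)) {
      problems.push(`entry ${index}: "related" must be an array`)
    }
  })
  return problems
}

const sample = [
  { id: '20097', description: 'VPN', related: ['10460'] },
  { id: '20097', description: 'VPN (copy)', related: ['10460'] }
]
console.log(findMappingProblems(sample)) // prints [ 'entry 1: duplicate id 20097' ]
```

In practice the argument would be the parsed contents of `src/etc/categories-mapping.json`, and an empty result means no problems were found.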
### Debugging
- **Server Logs**: Check Docker/Kubernetes logs for errors
- **Cron Logs**: View Kubernetes CronJob logs for database update issues
- **UDP Testing**: Use `npm run dev:client` to test directly
- **HTTP Testing**: Use `test-detailed.http` with VS Code REST Client
- **Error Messages**: All error logs must be in English
### Troubleshooting
- **NetStar Service Not Running**: Run `make gcf1-start`
- **Stale Categories**: Manually run `make gcf1-update` or wait for cron job
- **Port Conflicts**: Ensure ports 3333 (HTTP) and 33333 (UDP) are available
- **Docker Build Issues**: Check that Boost C++ libraries are installed correctly
## Current Development Status
### Recent Work
- ✅ HTTP server implementation (alongside UDP)
- ✅ Detailed categorization with reputation/age ratings
- ✅ Cron job for automated daily updates
- ✅ Singleton category converter pattern
- 🔄 Work in Progress:
- `playground.js` - experimental/testing code
- `parse-detailed-category-use-case.js` - new detailed parsing feature
- Enhanced `server.js` - expanded server capabilities
### Known Modified Files
- `playground.js` - development/testing (can be cleaned up)
- `src/server.js` - recent enhancements
- `makefile` - new convenience commands
- `test-detailed.http` - expanded test coverage
## Guidelines for Contributions
1. **Follow Existing Patterns**: Use use-case classes, follow module structure
2. **Test Before Committing**: Use `test-detailed.http` for API changes
3. **Update Mappings Properly**: Edit `categories-mapping.json` rather than hardcoding values
4. **Document Breaking Changes**: Update this guide if architecture changes
5. **Keep CircleCI Happy**: Ensure Docker build succeeds and K8s deployment configs are valid
6. **Don't Skip Steps**: Always test UDP and HTTP interfaces for categorization changes
7. **Language Standards**: All comments, error messages, and logs must be in English
## Resources & External Documentation
- **NetStar SDK**: Installed in Docker, documentation in inCompass SDK v3.1.0-2
- **Express.js**: https://expressjs.com
- **Node.js Child Process**: https://nodejs.org/api/child_process.html
- **Kubernetes**: https://kubernetes.io/docs
- **CircleCI**: Configuration at `.circleci/config.yml`

8
makefile Normal file

@@ -0,0 +1,8 @@
gcf1-start:
cd /usr/local/gcf1 && sbin/gcf1 start
gcf1-download:
cd /usr/local/gcf1 && bin/gcf1dbmng.sh etc urldb_download
gcf1-update:
cd /usr/local/gcf1 && bin/gcf1dbmng.sh etc urldb_update

93
package-lock.json generated

@@ -10,7 +10,7 @@
"license": "ISC",
"dependencies": {
"dotenv": "^16.4.5",
"express": "^4.21.0"
"express": "^4.19.2"
}
},
"node_modules/accepts": {
@@ -31,9 +31,9 @@
"integrity": "sha512-PCVAQswWemu6UdxsDFFX/+gVeYqKAod3D3UVm91jHwynguOwAvYPhx8nNlM++NqRcK6CxxpUafjmhIdKiHibqg=="
},
"node_modules/body-parser": {
"version": "1.20.3",
"resolved": "https://registry.npmjs.org/body-parser/-/body-parser-1.20.3.tgz",
"integrity": "sha512-7rAxByjUMqQ3/bHJy7D6OGXvx/MMc4IqBn/X0fcM1QUcAItpZrBEYhWGem+tzXH90c+G01ypMcYJBO9Y30203g==",
"version": "1.20.2",
"resolved": "https://registry.npmjs.org/body-parser/-/body-parser-1.20.2.tgz",
"integrity": "sha512-ml9pReCu3M61kGlqoTm2umSXTlRTuGTx0bfYj+uIUKKYycG5NtSbeetV3faSU6R7ajOPw0g/J1PvK4qNy7s5bA==",
"dependencies": {
"bytes": "3.1.2",
"content-type": "~1.0.5",
@@ -43,7 +43,7 @@
"http-errors": "2.0.0",
"iconv-lite": "0.4.24",
"on-finished": "2.4.1",
"qs": "6.13.0",
"qs": "6.11.0",
"raw-body": "2.5.2",
"type-is": "~1.6.18",
"unpipe": "1.0.0"
@@ -169,9 +169,9 @@
"integrity": "sha512-WMwm9LhRUo+WUaRN+vRuETqG89IgZphVSNkdFgeb6sS/E4OrDIN7t48CAewSHXc6C8lefD8KKfr5vY61brQlow=="
},
"node_modules/encodeurl": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/encodeurl/-/encodeurl-2.0.0.tgz",
"integrity": "sha512-Q0n9HRi4m6JuGIV1eFlmvJB7ZEVxu93IrMyiMsGC0lrMJMWzRgx6WGquyfQgZVb31vhGgXnfmPNNXmxnOkRBrg==",
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/encodeurl/-/encodeurl-1.0.2.tgz",
"integrity": "sha512-TPJXq8JqFaVYm2CWmPvnP2Iyo4ZSM7/QKcSmuMLDObfpH5fi7RUGmd/rTDf+rut/saiDiQEeVTNgAmJEdAOx0w==",
"engines": {
"node": ">= 0.8"
}
@@ -209,36 +209,36 @@
}
},
"node_modules/express": {
"version": "4.21.0",
"resolved": "https://registry.npmjs.org/express/-/express-4.21.0.tgz",
"integrity": "sha512-VqcNGcj/Id5ZT1LZ/cfihi3ttTn+NJmkli2eZADigjq29qTlWi/hAQ43t/VLPq8+UX06FCEx3ByOYet6ZFblng==",
"version": "4.19.2",
"resolved": "https://registry.npmjs.org/express/-/express-4.19.2.tgz",
"integrity": "sha512-5T6nhjsT+EOMzuck8JjBHARTHfMht0POzlA60WV2pMD3gyXw2LZnZ+ueGdNxG+0calOJcWKbpFcuzLZ91YWq9Q==",
"dependencies": {
"accepts": "~1.3.8",
"array-flatten": "1.1.1",
"body-parser": "1.20.3",
"body-parser": "1.20.2",
"content-disposition": "0.5.4",
"content-type": "~1.0.4",
"cookie": "0.6.0",
"cookie-signature": "1.0.6",
"debug": "2.6.9",
"depd": "2.0.0",
"encodeurl": "~2.0.0",
"encodeurl": "~1.0.2",
"escape-html": "~1.0.3",
"etag": "~1.8.1",
"finalhandler": "1.3.1",
"finalhandler": "1.2.0",
"fresh": "0.5.2",
"http-errors": "2.0.0",
"merge-descriptors": "1.0.3",
"merge-descriptors": "1.0.1",
"methods": "~1.1.2",
"on-finished": "2.4.1",
"parseurl": "~1.3.3",
"path-to-regexp": "0.1.10",
"path-to-regexp": "0.1.7",
"proxy-addr": "~2.0.7",
"qs": "6.13.0",
"qs": "6.11.0",
"range-parser": "~1.2.1",
"safe-buffer": "5.2.1",
"send": "0.19.0",
"serve-static": "1.16.2",
"send": "0.18.0",
"serve-static": "1.15.0",
"setprototypeof": "1.2.0",
"statuses": "2.0.1",
"type-is": "~1.6.18",
@@ -250,12 +250,12 @@
}
},
"node_modules/finalhandler": {
"version": "1.3.1",
"resolved": "https://registry.npmjs.org/finalhandler/-/finalhandler-1.3.1.tgz",
"integrity": "sha512-6BN9trH7bp3qvnrRyzsBz+g3lZxTNZTbVO2EV1CS0WIcDbawYVdYvGflME/9QP0h0pYlCDBCTjYa9nZzMDpyxQ==",
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/finalhandler/-/finalhandler-1.2.0.tgz",
"integrity": "sha512-5uXcUVftlQMFnWC9qu/svkWv3GTd2PfUhK/3PLkYNAe7FbqJMt3515HaxE6eRL74GdsriiwujiawdaB1BpEISg==",
"dependencies": {
"debug": "2.6.9",
"encodeurl": "~2.0.0",
"encodeurl": "~1.0.2",
"escape-html": "~1.0.3",
"on-finished": "2.4.1",
"parseurl": "~1.3.3",
@@ -411,12 +411,9 @@
}
},
"node_modules/merge-descriptors": {
"version": "1.0.3",
"resolved": "https://registry.npmjs.org/merge-descriptors/-/merge-descriptors-1.0.3.tgz",
"integrity": "sha512-gaNvAS7TZ897/rVaZ0nMtAyxNyi/pdbjbAwUpFQpN70GqnVfOiXpeUUMKRBmzXaSQ8DdTX4/0ms62r2K+hE6mQ==",
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/merge-descriptors/-/merge-descriptors-1.0.1.tgz",
"integrity": "sha512-cCi6g3/Zr1iqQi6ySbseM1Xvooa98N0w31jzUYrXPX2xqObmFGHJ0tQ5u74H3mVh7wLouTseZyYIq39g8cNp1w=="
},
"node_modules/methods": {
"version": "1.1.2",
@@ -500,9 +497,9 @@
}
},
"node_modules/path-to-regexp": {
"version": "0.1.10",
"resolved": "https://registry.npmjs.org/path-to-regexp/-/path-to-regexp-0.1.10.tgz",
"integrity": "sha512-7lf7qcQidTku0Gu3YDPc8DJ1q7OOucfa/BSsIwjuh56VU7katFvuM8hULfkwB3Fns/rsVF7PwPKVw1sl5KQS9w=="
"version": "0.1.7",
"resolved": "https://registry.npmjs.org/path-to-regexp/-/path-to-regexp-0.1.7.tgz",
"integrity": "sha512-5DFkuoqlv1uYQKxy8omFBeJPQcdoE07Kv2sferDCrAq1ohOU+MSDswDIbnx3YAM60qIOnYa53wBhXW0EbMonrQ=="
},
"node_modules/proxy-addr": {
"version": "2.0.7",
@@ -517,11 +514,11 @@
}
},
"node_modules/qs": {
"version": "6.13.0",
"resolved": "https://registry.npmjs.org/qs/-/qs-6.13.0.tgz",
"integrity": "sha512-+38qI9SOr8tfZ4QmJNplMUxqjbe7LKvvZgWdExBOmd+egZTtjLB67Gu0HRX3u/XOq7UU2Nx6nsjvS16Z9uwfpg==",
"version": "6.11.0",
"resolved": "https://registry.npmjs.org/qs/-/qs-6.11.0.tgz",
"integrity": "sha512-MvjoMCJwEarSbUYk5O+nmoSzSutSsTwF85zcHPQ9OrlFoZOYIjaqBAJIqIXjptyD5vThxGq52Xu/MaJzRkIk4Q==",
"dependencies": {
"side-channel": "^1.0.6"
"side-channel": "^1.0.4"
},
"engines": {
"node": ">=0.6"
@@ -577,9 +574,9 @@
"integrity": "sha512-YZo3K82SD7Riyi0E1EQPojLz7kpepnSQI9IyPbHHg1XXXevb5dJI7tpyN2ADxGcQbHG7vcyRHk0cbwqcQriUtg=="
},
"node_modules/send": {
"version": "0.19.0",
"resolved": "https://registry.npmjs.org/send/-/send-0.19.0.tgz",
"integrity": "sha512-dW41u5VfLXu8SJh5bwRmyYUbAoSB3c9uQh6L8h/KtsFREPWpbX1lrljJo186Jc4nmci/sGUZ9a0a0J2zgfq2hw==",
"version": "0.18.0",
"resolved": "https://registry.npmjs.org/send/-/send-0.18.0.tgz",
"integrity": "sha512-qqWzuOjSFOuqPjFe4NOsMLafToQQwBSOEpS+FwEt3A2V3vKubTquT3vmLTQpFgMXp8AlFWFuP1qKaJZOtPpVXg==",
"dependencies": {
"debug": "2.6.9",
"depd": "2.0.0",
@@ -599,28 +596,20 @@
"node": ">= 0.8.0"
}
},
"node_modules/send/node_modules/encodeurl": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/encodeurl/-/encodeurl-1.0.2.tgz",
"integrity": "sha512-TPJXq8JqFaVYm2CWmPvnP2Iyo4ZSM7/QKcSmuMLDObfpH5fi7RUGmd/rTDf+rut/saiDiQEeVTNgAmJEdAOx0w==",
"engines": {
"node": ">= 0.8"
}
},
"node_modules/send/node_modules/ms": {
"version": "2.1.3",
"resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
"integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA=="
},
"node_modules/serve-static": {
"version": "1.16.2",
"resolved": "https://registry.npmjs.org/serve-static/-/serve-static-1.16.2.tgz",
"integrity": "sha512-VqpjJZKadQB/PEbEwvFdO43Ax5dFBZ2UECszz8bQ7pi7wt//PWe1P6MN7eCnjsatYtBT6EuiClbjSWP2WrIoTw==",
"version": "1.15.0",
"resolved": "https://registry.npmjs.org/serve-static/-/serve-static-1.15.0.tgz",
"integrity": "sha512-XGuRDNjXUijsUL0vl6nSD7cwURuzEgglbOaFuZM9g3kwDXOWVTck0jLzjPzGD+TazWbboZYu52/9/XPdUgne9g==",
"dependencies": {
"encodeurl": "~2.0.0",
"encodeurl": "~1.0.2",
"escape-html": "~1.0.3",
"parseurl": "~1.3.3",
"send": "0.19.0"
"send": "0.18.0"
},
"engines": {
"node": ">= 0.8.0"


@@ -13,6 +13,6 @@
"description": "",
"dependencies": {
"dotenv": "^16.4.5",
"express": "^4.21.0"
"express": "^4.19.2"
}
}


@@ -7,12 +7,12 @@ const { spawn, exec } = require("node:child_process")
// })
exec("echo '99999.incompass.netstar-inc.com' | bin/gcf1check.sh etc check_categorize_hybrid", { cwd: '/usr/local/gcf1' }, (error, stdout, stderr)=>{
exec("echo 'li12.pages.dev' | bin/gcf1check.sh etc check_categorize_hybrid", { cwd: '/usr/local/gcf1' }, (error, stdout, stderr)=>{
if (error) {
console.error(error)
return
}
console.log(stdout.trim())
console.log(stdout.trim().split(/\t|\s{2,}/))
})


@@ -1074,7 +1074,7 @@
"id": "20097",
"description": "VPN",
"related": [
"10451"
"10460"
]
},
{


@@ -5,6 +5,9 @@ const app = require('./app')
const bodyParser = require('body-parser')
const cron = require('./cron')
const express = require('express')
const { ParseDetailedCategoryUseCase } = require('./use-cases/parse-detailed-category-use-case')
const { CategoryConverterUseCase } = require('./use-cases/category-converter-use-case')
const categoriesMapping = require('./etc/categories-mapping.json')
const httpServer = express()
httpServer.use(bodyParser.json()) // for parsing application/json
@@ -14,6 +17,10 @@ cron()
const UDP_PORT = process.env.UDP_PORT
const HTTP_PORT = process.env.HTTP_PORT
// Initialize use cases
const categoryConverter = new CategoryConverterUseCase({ categoriesMapping })
const parseDetailedCategory = new ParseDetailedCategoryUseCase({ categoryConverter })
const server = dgram.createSocket('udp4');
@@ -69,6 +76,19 @@ httpServer.post('/', async (req, res) => {
})
httpServer.post('/detailed', async (req, res) => {
try {
const { fqdn } = req.body
const detailedResult = await parseDetailedCategory.execute(fqdn)
res.status(200).json(detailedResult)
} catch (err) {
console.error('Error in /detailed endpoint:', err)
res.status(500).json({ error: err.message || err })
}
})
httpServer.listen(HTTP_PORT, () => {
console.log(`HTTP server listening on ${HTTP_PORT}`)
})


@@ -0,0 +1,130 @@
const { exec } = require("node:child_process")
class ParseDetailedCategoryUseCase {
constructor({ categoryConverter }) {
this.categoryConverter = categoryConverter;
}
execute(domain) {
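return new Promise((resolve, reject) => {
// Reject anything that is not a plain hostname before interpolating it into a shell command
if (typeof domain !== 'string' || !/^[A-Za-z0-9_][A-Za-z0-9._-]*$/.test(domain)) {
reject(new Error('Invalid domain: expected a plain FQDN'));
return;
}
exec(`echo ${domain} | bin/gcf1check.sh etc check_categorize_hybrid`,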
{ cwd: '/usr/local/gcf1' },
(error, stdout, stderr) => {
if (error) {
console.error(error);
reject(error);
return;
}
try {
const parsed = this.parseOutput(stdout);
resolve(parsed);
} catch (parseError) {
console.error('Parse error:', parseError);
reject(parseError);
}
});
});
}
parseOutput(output) {
// Extract all quoted strings in order
const quotedStrings = [];
const quoteRegex = /"([^"]*)"/g;
let match;
while ((match = quoteRegex.exec(output)) !== null) {
quotedStrings.push(match[1]);
}
// Split on whitespace
const parts = output.trim().split(/\t|\s{2,}/);
// Find numeric IDs by looking for numbers that appear in sequence
// After "Categorized", we have: count, primary_id, secondary_id, ..., reputation_score, ..., age_rating_score
// Primary ID is parts[2] (always third element after "Categorized" and count)
const primary = parts[2];
const primaryName = quotedStrings[0];
// Secondary ID is parts[4] (always fifth element)
const secondary = parts[4];
let secondaryName = '';
let quotedIndex = 1; // Start after primary name
// If secondary is not "0", it has a quoted name
if (secondary !== '0' && secondary !== '-') {
secondaryName = quotedStrings[quotedIndex];
quotedIndex++;
} else {
secondaryName = parts[5]; // placeholder field (for example '-') when there is no secondary category
}
// Find reputation score - it's a single digit that comes after some markers
// Look for the pattern: a digit followed by a quoted reputation name
let reputation = '';
let reputationName = '';
// Scan from parts[7] onwards to find reputation (it's usually around parts[8-11])
for (let i = 7; i < parts.length; i++) {
const part = parts[i];
// Look for single/double digit that's not a category ID
if (!isNaN(part) && part !== '-' && !part.includes('x') && !part.includes('|')) {
const num = parseInt(part, 10);
// Reputation scores are typically 0-5, single digit
const next = parts[i + 1] || ''; // undefined when `part` is the last field
if (num >= 0 && num <= 5 && next !== '-' && !next.includes('0x')) {
reputation = part;
reputationName = quotedStrings[quotedIndex];
quotedIndex++;
break;
}
}
}
// Find age rating - it's another single digit that comes after more markers
// After reputation, we should find age rating
let ageRating = '';
let ageRatingName = '';
for (let i = 12; i < parts.length; i++) {
const part = parts[i];
if (!isNaN(part) && part !== '-' && !part.includes('x') && !part.includes('|')) {
const num = parseInt(part);
if (num >= 0 && num <= 5) {
// Make sure it's not already used as reputation
if (part !== reputation || i > 10) {
ageRating = part;
ageRatingName = quotedStrings[quotedIndex];
break;
}
}
}
}
// Convert category IDs using the mapper
const primaryMapped = this.categoryConverter.execute(primary);
const secondaryMapped = secondary !== '-' && secondary !== '0' ?
this.categoryConverter.execute(secondary) : null;
// Build result array in the same format as the / endpoint (as strings)
const resultArray = [String(primaryMapped)];
if (secondaryMapped !== null) {
resultArray.push(String(secondaryMapped));
}
return {
result: resultArray,
primary: primaryMapped,
primary_name: primaryName,
secondary: secondaryMapped,
secondary_name: secondaryName,
reputation: parseInt(reputation),
reputation_name: reputationName,
age_rating: parseInt(ageRating),
age_rating_name: ageRatingName,
raw_output: output.trim()
};
}
}
module.exports = { ParseDetailedCategoryUseCase }

49
test-detailed.http Normal file

@@ -0,0 +1,49 @@
@baseUrl = http://localhost:3333
### Test detailed endpoint with Facebook
POST {{baseUrl}}/detailed
Content-Type: application/json
{
"fqdn": "facebook.com"
}
### Test simple endpoint with TikTok
POST {{baseUrl}}/
Content-Type: application/json
{
"fqdn": "tiktok.com"
}
### Test detailed endpoint with Reddit
POST {{baseUrl}}/detailed
Content-Type: application/json
{
"fqdn": "reddit.com"
}
### Test detailed endpoint with YouTube
POST {{baseUrl}}/detailed
Content-Type: application/json
{
"fqdn": "youtube.com"
}
### Test detailed endpoint with Twitter
POST {{baseUrl}}/detailed
Content-Type: application/json
{
"fqdn": "twitter.com"
}
### Test simple endpoint with Facebook (original endpoint)
POST {{baseUrl}}/
Content-Type: application/json
{
"fqdn": "facebook.com"
}