Commit 6dd6ac30 authored by Jason Frisvold

- Remove design doc

- - Moved to gitlab wiki
parent 7c527504
# Skynet design document v0.1.050614
## Overview
This system is designed to automate the process of scanning specified subnets on a scheduled basis. Data from each scan is stored for historical purposes allowing the administrator to identify change over time. Automated reporting can be used to provide regular updates on security status as well as alert the administrator when anomalies are detected. Scans are compared against a previous “baseline” scan, defined for each scanning host.
This is a “living” document and is not feature complete. Check back often for updated versions.
## General Features / Components
### Cloud component
* Local timing and parameters data
* Local control daemon
* SQLite database for data storage
#### Config file layout
* Flat file
* 9 lines
> Action - add, delete, modify
> Server ID - ID of entry from server
> Minute - * or [0-59] or [0-59],[0-59],etc
> Hour - * or [0-23] or [0-23],[0-23],etc
> Day - * or [1-31] or [1-31],[1-31],etc
> Month - * or [1-12] or [1-12],[1-12],etc
> Override Flag - True (1) or False (0)
> IP Range - CIDR Notation
> Options - Valid NMAP CLI options
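
The nine-line layout above can be read with a small parser. The following is a minimal sketch, assuming one field per line in the order listed and an override flag written as `0` or `1`. The `ScanInstruction` class and `parse_instruction` function are hypothetical names, not part of the design.

```python
from dataclasses import dataclass

# Field order follows the nine-line layout in this section (assumed: one field per line).
FIELDS = ("action", "server_id", "minute", "hour", "day", "month",
          "override_flag", "ip_range", "options")
VALID_ACTIONS = {"add", "delete", "modify"}


@dataclass
class ScanInstruction:
    action: str
    server_id: int
    minute: str
    hour: str
    day: str
    month: str
    override_flag: bool
    ip_range: str
    options: str


def parse_instruction(text: str) -> ScanInstruction:
    """Parse one nine-line instruction file into a ScanInstruction."""
    lines = [line.strip() for line in text.splitlines()]
    if len(lines) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} lines, got {len(lines)}")
    raw = dict(zip(FIELDS, lines))
    if raw["action"] not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {raw['action']!r}")
    if raw["override_flag"] not in ("0", "1"):
        raise ValueError("override flag must be 0 or 1")
    return ScanInstruction(
        action=raw["action"],
        server_id=int(raw["server_id"]),
        minute=raw["minute"],
        hour=raw["hour"],
        day=raw["day"],
        month=raw["month"],
        override_flag=raw["override_flag"] == "1",
        ip_range=raw["ip_range"],
        options=raw["options"],
    )
```

The timing fields are kept as text here so that `*` and comma-separated lists survive unchanged until the scheduler interprets them.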
### Control system
* Control nmap options?
* Identify IP blocks (CIDR notation)
  * Stored as integers?
  * How do we handle IPv6?
* Control scan time
  * Can we feed back into this?
    * i.e., auto-adjust scan times if they are running over
    * How many over-time scans before changes are made?
    * Notify the administrator of the change
* Push parameters to cloud
  * Do we do this on the fly?
    * Has to open a connection for every scan
  * Do we store timing data on the remotes?
    * Timing data possibly sensitive
  * Should be able to define multiple destinations
* Retrieve finished data from the cloud
  * Dynamically determine retrieval time based on scan times?
* Create ndiff files?
  * This can be done quickly on the fly, so maybe not worth the effort
* Use cloud APIs to start remote servers
  * Save $$, no local timing data necessary
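
On the open question of storing IP blocks as integers, Python's standard `ipaddress` module gives one way to explore the trade-off. This is a sketch only; `cidr_to_int_range` is a hypothetical helper, and the design does not commit to this representation.

```python
import ipaddress
from typing import Tuple


def cidr_to_int_range(cidr: str) -> Tuple[int, int, int]:
    """Return (ip_version, first_address, last_address) with addresses as integers."""
    # strict=True rejects blocks with host bits set, such as 192.0.2.1/24.
    network = ipaddress.ip_network(cidr, strict=True)
    return (network.version,
            int(network.network_address),
            int(network.broadcast_address))


version, first, last = cidr_to_int_range("192.0.2.0/24")
print(version, last - first + 1)          # address count of the IPv4 block

version6, first6, last6 = cidr_to_int_range("2001:db8::/32")
print(version6, last6.bit_length())       # IPv6 addresses need up to 128 bits
```

The same call handles both address families, but IPv6 values do not fit in a 64-bit integer column, so an integer-based schema would likely need a different column type or a split representation for IPv6.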
### Visualization system
* MySQL database of ndiff data?
  * Might be overkill
* Text-based ndiff data?
  * Easy to keep this encrypted
* Combination of both?
  * MySQL for metadata
  * Text for full scan data
* PHP graphing of changes
  * Time per scan
  * Number of hosts per scan
  * Number of ports per scan
  * Change in hosts/ports per scan
### Reporting subsystem
* Automated reporting
  * Reports triggered by control daemon
* Multiple report types
  * Pre-defined
  * Custom
* Configurable email address
* Non-emailed reports?
  * On the fly reports
  * GUI only reports
## Component Detail
### Cloud Component
The "cloud" piece of this software is the dumb workhorse piece of the system. Setup should be minimal and easily deployed on disparate systems.
Instructions are delivered via flat text files placed into a configuration directory. The local system parses these files and builds a localized timing table for spawning processes. This localized data is stored in a SQLite database. The incoming configuration files identify what can be added and what can be removed from the timing table. Incoming files are JSON encoded and contain the timing information, IP range, nmap options, override flags, and the ID assigned by the server.
A spawning daemon is responsible for reading the timing table and spawning new processes at the appropriate time. New scans are spawned as separate processes with their PID being noted by the spawning daemon. The spawning daemon should identify if the previous scan process has completed prior to starting a new process. In the event of an existing process, the daemon should identify if the process is still running (PID exists) or if it has completed or died. It should send an appropriate notification to the administrator for existing PIDs if an override flag is not set. In the case of a completed process, the new scan should be spawned as requested. If the PID still exists, the spawning daemon should only spawn the process if there's an override flag set. This gives the administrator control to run scans on a tighter schedule when the run-time of a single scan may exceed the period of time between scans.
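
The spawn decision described above reduces to a small truth table. The sketch below assumes a POSIX host, where sending signal 0 checks for a process without affecting it. `Decision`, `pid_exists`, and `decide` are hypothetical names.

```python
import os
from typing import NamedTuple, Optional


class Decision(NamedTuple):
    spawn: bool         # start the new scan now
    notify_admin: bool  # tell the administrator a scan was held back


def pid_exists(pid: int) -> bool:
    """Check whether a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 performs error checking only, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def decide(previous_pid: Optional[int], override_flag: bool) -> Decision:
    """Apply the rules from the paragraph above to one timing-table entry."""
    still_running = previous_pid is not None and pid_exists(previous_pid)
    if not still_running:
        # Previous scan completed or died: spawn as requested.
        return Decision(spawn=True, notify_admin=False)
    if override_flag:
        # Previous scan still running, but overlapping scans are allowed.
        return Decision(spawn=True, notify_admin=False)
    # Previous scan still running and no override: skip and notify.
    return Decision(spawn=False, notify_admin=True)
```

A PID check alone cannot tell a long-running scan from an unrelated process that reused the same PID, so a real daemon would probably also compare the recorded start time.
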
Finished scans should encrypt the scan results using a public GPG key and the plain text version of the file should be scrubbed. (Can we encrypt on the fly as the scan is running?) All completed files are stored in a holding area until the central processing system retrieves them. After retrieval, reports are scrubbed from the system.
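
On the question of encrypting on the fly: nmap can write XML to standard output with `-oX -`, and gpg encrypts standard input when no input file is named, so the two can be piped together. The sketch below is one possible arrangement, not a tested deployment. The function names are hypothetical, and it assumes the recipient's public key is already in the local keyring and trusted.

```python
import subprocess
from typing import List


def nmap_command(ip_range: str, nmap_options: List[str]) -> List[str]:
    """Build an nmap invocation that writes XML output to stdout ("-oX -")."""
    return ["nmap", *nmap_options, "-oX", "-", ip_range]


def gpg_command(recipient: str, output_path: str) -> List[str]:
    """Build a gpg invocation that encrypts stdin to output_path for one recipient."""
    return ["gpg", "--batch", "--yes", "--encrypt",
            "--recipient", recipient, "--output", output_path]


def scan_encrypted(ip_range: str, nmap_options: List[str],
                   recipient: str, output_path: str) -> int:
    """Pipe nmap's XML straight into gpg so no plain text file is written."""
    nmap = subprocess.Popen(nmap_command(ip_range, nmap_options),
                            stdout=subprocess.PIPE)
    gpg = subprocess.Popen(gpg_command(recipient, output_path),
                           stdin=nmap.stdout)
    nmap.stdout.close()  # let nmap receive SIGPIPE if gpg exits early
    gpg_status = gpg.wait()
    nmap_status = nmap.wait()
    return nmap_status or gpg_status  # non-zero if either process failed
```

With this arrangement there is no plain text file to scrub afterwards, at the cost of losing the scan output if gpg fails partway through.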
#### Timing Table Format
server_id INT,
minute TEXT, // 0-59
hour TEXT, // 0-23
day TEXT, // 1-31
month TEXT, // 1-12
override_flag BOOLEAN,
ip_range TEXT,
nmap_options TEXT
CREATE TABLE spawned (
server_id INT,
start_time INT,
pid INT,
overtime BOOLEAN
);
CREATE TABLE spawn_log (
server_id INT,
start_time INT,
end_time INT,
status INT
);
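
The text timing columns can be matched against the current time in the same way cron does. This is a minimal sketch, assuming each field is either `*`, a single number, or a comma-separated list of numbers, as in the config file layout. `field_matches` and `is_due` are hypothetical names.

```python
from datetime import datetime


def field_matches(spec: str, value: int) -> bool:
    """True if a timing field ("*", "5", or "0,15,30") matches value."""
    spec = spec.strip()
    if spec == "*":
        return True
    return value in {int(part) for part in spec.split(",")}


def is_due(row: dict, now: datetime) -> bool:
    """True if every timing column of a timing-table row matches `now`."""
    return (field_matches(row["minute"], now.minute)
            and field_matches(row["hour"], now.hour)
            and field_matches(row["day"], now.day)
            and field_matches(row["month"], now.month))


# Example row: scan on the hour and half hour, in January and July only.
row = {"minute": "0,30", "hour": "*", "day": "*", "month": "1,7"}
print(is_due(row, datetime(2014, 7, 4, 13, 30)))
```

A daemon that wakes once per minute and calls `is_due` on each row would be enough to drive the spawn logic. Range checks such as 0-59 for minutes are left out of the sketch.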
### Control System
The control system is the central brain of the scanning system. It is responsible for interacting with the administrator, pushing schedule data to the scanning systems, and retrieving scan data from the scanning systems.
The control system stores all schedule data in a MySQL database. New schedule configurations are pushed out on an automated basis as the administrator chooses. Data is pushed to the remote systems via a simple SCP process. Automated retrieval of data occurs on a timed schedule. Retrieved data should be decrypted using a private GPG key. Metadata from the scans is added to a MySQL database and the raw scan data is stored in a predefined directory structure. Files should be named appropriately to indicate date and time of scan so manual interaction with files is simplified.
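
One way to name retrieved files so that the date and time of the scan are obvious is sketched below. The directory layout, the `server-<id>` prefix, and the `scan_path` helper are all assumptions for illustration, as is the reading of `start_time` as Unix epoch seconds.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def scan_path(base_dir: str, server_id: int, start_time: int) -> PurePosixPath:
    """Build <base>/server-<id>/YYYY/MM/DD/scan-<UTC timestamp>.xml for one scan."""
    # start_time is assumed to be Unix epoch seconds; UTC avoids ambiguity
    # when scanning hosts sit in different time zones.
    started = datetime.fromtimestamp(start_time, tz=timezone.utc)
    return (PurePosixPath(base_dir)
            / f"server-{server_id}"
            / started.strftime("%Y/%m/%d")
            / f"scan-{started.strftime('%Y%m%dT%H%M%SZ')}.xml")


noon = int(datetime(2014, 5, 6, 12, 0, tzinfo=timezone.utc).timestamp())
print(scan_path("/var/lib/skynet/scans", 7, noon))
```

Timestamps in this form sort lexically in chronological order, which keeps a plain directory listing useful for manual inspection.
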
The control system is a non-interactive system designed to run in the background. All interactions with the control system are handled via the administrative web interface.
## Web Interface
### Administrative Interface
* Frequency of config pushes
* Frequency of data retrievals
* Add/Modify/Delete
  * Servers
  * Spawn tasks
### Visualization System
The visualization system is essentially the GUI front end for the system. It allows administrators GUI access to the control system for scheduling scans and reports as well as setting a new baseline. It also provides a visualization of the data being reported from the scanning systems. The visualization system provides graphical views of data such as active hosts, active ports, filtered ports, and average scan times. The graphical system will use the metadata stored in the MySQL database to generate these graphs.
Possible graphs:
* Number of hosts per scan
* Number of ports per scan
* Time per scan
* Changes per scan
### Reporting Subsystem
Pre-defined automated reports can be scheduled within the reporting subsystem. Reports are associated with one or more timed scans and run on a predefined schedule. Results from the scans are summarized and presented in an easily digestible format. Reports can be delivered via email, or viewed via CLI or GUI output. Reports should be cached for a period of time, allowing quick retrieval of past reports.
Reports should include statistics relevant to the scan: the time taken, the number of hosts and ports identified, new ports found, and old ports removed.
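
The statistics listed above can be derived by comparing a scan against its baseline. The sketch below assumes each scan has already been reduced to a set of `(host, port)` pairs for open ports, which is a simplification of what ndiff reports. `summarize_changes` is a hypothetical helper.

```python
from typing import Dict, Set, Tuple

Port = Tuple[str, int]  # (host address, port number)


def summarize_changes(baseline: Set[Port], current: Set[Port],
                      start_time: int, end_time: int) -> Dict[str, object]:
    """Summarize one scan relative to its baseline."""
    return {
        "scan_seconds": end_time - start_time,
        # Only hosts with at least one open port are counted here.
        "hosts": len({host for host, _ in current}),
        "ports": len(current),
        "new_ports": sorted(current - baseline),
        "removed_ports": sorted(baseline - current),
    }


baseline = {("192.0.2.10", 22), ("192.0.2.10", 80), ("192.0.2.11", 443)}
current = {("192.0.2.10", 22), ("192.0.2.11", 443), ("192.0.2.12", 8080)}
report = summarize_changes(baseline, current, start_time=1000, end_time=1090)
print(report["new_ports"], report["removed_ports"])
```

Set difference in each direction gives the new and removed ports directly, and the resulting counts map onto the graphs proposed for the visualization system.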
## Back End design
### MySQL Database Definition
id int, // auto-increment ID
server_ip int, // IP of the server
description text, // Description of the server
ssh_key text, // Location of the SSH key to connect to the server
gpg_key text, // Location of the GPG key to decrypt data
id int,
options text,
override tinyint,
id int,
cloud_id int,
spawn_id int,
hour int,
minute int,
day int,
month int,
CREATE TABLE spawn_log (
id INT,
cloud_id INT,
spawn_id INT,
start_time INT,
end_time INT,
status INT
);