De-Coder’s Ring

Software doesn’t have to be hard

Month: August 2016

Stix, Taxii: Understanding Cybersecurity Intelligence

Cyber Intelligence Takes Balls

Introduction
I spent years building a packet capture and network forensics tool. Slicing and dicing packets makes sense to me. Headers, payloads, etc.: easy peasy (no, it’s not really easy, but like I said, years). Understanding complex data structures comes with the territory, and so far I haven’t met a challenge that took me too long to understand.

Then I met Taxii. Then Stix. I forgot how painful XML was.

Taxii: Trusted Automated eXchange of Indicator Information

STIX: Structured Threat Information eXpression

FYI: All the visualizations and screenshots are grabbed from Neo4j, a top-rated and widely used graph database. My work has some specific requirements that I think are best served by nodes, edges and finding relationships between data, so I thought I’d give it a shot. Nice to see a built-in browser that does some pretty fantastic drawing and layouts without any work on my part. (Docker image to boot!)

Background
TAXII is a set of standards for transporting intelligence data. The standard (now an OASIS standard) defines the HTTP(S) requests a client makes to a web server to query and receive intelligence. For most use cases, there are three main phases of interaction with a server:

  1. Discovery – Figure out the ‘other’ end points; this is where you start
  2. Collection Information – Determine how the intelligence is stored. Think of collections as a repository, or grouping of intelligence data within the server.
  3. Poll (pull) – (or push, but I’m focusing on pull). Receive intelligence data for further processing. Poll requests will result in different STIX packages (more to come)

I’m not going to go into details on the interactions here, but the Python library for TAXII does a good enough job to get you started. It’s not perfectly clear, but it helps.
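To give a feel for what the discovery step looks like on the wire, here’s a rough sketch that builds (but doesn’t send) a discovery request using only the standard library instead of the TAXII library. The namespace and the X-TAXII header values are what I recall from the TAXII 1.1 XML and HTTP binding specs, and the URL is a placeholder, so treat all of them as assumptions and check them against the spec and your server.

```python
import uuid
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: TAXII 1.1 XML message namespace, as recalled from the spec.
TAXII_NS = "http://taxii.mitre.org/messages/taxii_xml_binding-1.1"

# Assumption: header values from the TAXII 1.1 HTTP binding; verify before use.
TAXII_HEADERS = {
    "Content-Type": "application/xml",
    "Accept": "application/xml",
    "X-TAXII-Content-Type": "urn:taxii.mitre.org:message:xml:1.1",
    "X-TAXII-Accept": "urn:taxii.mitre.org:message:xml:1.1",
    "X-TAXII-Services": "urn:taxii.mitre.org:services:1.1",
    "X-TAXII-Protocol": "urn:taxii.mitre.org:protocol:http:1.0",
}


def build_discovery_request(url):
    """Build (but do not send) an HTTP POST carrying a Discovery_Request."""
    root = ET.Element("{%s}Discovery_Request" % TAXII_NS)
    root.set("message_id", uuid.uuid4().hex)
    body = ET.tostring(root, encoding="utf-8")
    return urllib.request.Request(url, data=body, headers=TAXII_HEADERS, method="POST")


# Hypothetical endpoint; swap in the discovery URL of your TAXII server.
request = build_discovery_request("http://taxii.example.com/discovery")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending it would be a call to `urllib.request.urlopen(request)`. The collection information and poll phases follow the same pattern with different request messages, which is the part the Python library saves you from hand-rolling.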

STIX defines some data structures around intelligence data.   Everything is organized in a ‘package’.  The package contains different pieces of information about the package and about the intelligence.  In this article, I’ll focus on ‘observables’ and ‘indicators’.  The items I won’t talk much about are:

  • TTPs:  Tactics, Techniques and Procedures.  What mechanisms are the ‘bad guys’ using.  Software packages, exploit kits, etc.
  • Exploit Target:  What’s being attacked
  • Threat Actor: If known, who/what’s attacking?
  • TLPs, Kill chains, etc

Observables

Observables are the facts.  They are pieces of data that you may see on your network, on a host, in an email, etc.  These can be URLs, email addresses, files (and their corresponding hashes), IP addresses, etc.   A fact is a fact.  There’s no context around it, it’s just a fact.

A URL that can be seen on a network

Indicators

Indicators are the ‘why’ around the facts.  These tell you what’s wrong with an IP address, or give the context and story about an email that was seen.

Context around an observable

In the above pictures, you’ll see a malicious URL (hulk**, seriously, don’t follow it).   The observable component is the URL.  The indicator component tells us that it’s malicious.  The description above tells us that the intelligence center at phishtank.com identified the URL as part of a phishing scheme.

Source of data

All security analysts are well aware of some open source intelligence data: Emerging Threats, PhishTank, etc. This data is updated regularly, and each source provides it in its own format. Since we’re talking about using TAXII to transport this data, we need an open source/free TAXII source. Enter http://hailataxii.com

When you make a query against Hailataxii’s discovery end point, you learn the collection and poll URLs. You also get the inbox URL, but we’re not using that today. (Coincidentally, HAT’s URLs are all the same.)

Once you query the collection information end point, you see 11 collections (at the time of writing). I list those at the end of this post. From there, we can make Poll requests to each collection and start receiving (hundreds? thousands?) of STIX packages.

STIX Package

Since I’m a network monitoring junkie, I want to see the observables I can monitor, specifically IPs and URLs. Parsing through the data, I find some interesting tidbits. Some packages have observables at the top level, and some have observables as children of the indicators. No big deal, we’ll keep it all and start storing/displaying.
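Here’s a minimal sketch of handling both placements in one pass. It is not my actual parser: the sample document is a stripped-down stand-in for a STIX package (no namespaces, made-up IDs and URLs), and the code matches on local element names so a namespaced document would be treated the same way.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a STIX package; real packages carry namespaces and
# far more detail. The IDs and example.com URLs are made up.
SAMPLE = """
<STIX_Package>
  <Observables>
    <Observable id="obs-1"><Value>http://bad.example.com/a</Value></Observable>
  </Observables>
  <Indicators>
    <Indicator id="ind-1">
      <Observable id="obs-2"><Value>http://bad.example.com/b</Value></Observable>
    </Indicator>
  </Indicators>
</STIX_Package>
"""


def local(tag):
    """Strip a '{namespace}' prefix so matching works with or without one."""
    return tag.rsplit("}", 1)[-1]


def collect_observables(xml_text):
    """Return (observable_id, parent_indicator_id_or_None) for every observable."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    found = []
    for elem in root.iter():
        if local(elem.tag) != "Observable":
            continue
        parent = parents.get(elem)
        is_child = parent is not None and local(parent.tag) == "Indicator"
        found.append((elem.get("id"), parent.get("id") if is_child else None))
    return found


print(collect_observables(SAMPLE))
```

Keeping the parent indicator ID (or `None`) next to each observable means nothing is thrown away, and the relationship can be stored later.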

Once it’s all parsed using some custom Python (what a mess!), I’m able to start loading my nodes and edges. It’s straightforward: I build nodes for the Community (Hailataxii), the Collection, the Package, Indicators and Observables. The observables can be related to the Indicator and/or the Package.
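The shape of that graph can be sketched without a database at all. The snippet below is an illustration, not my loader: the parsed-package dict layout and the relationship names (`HAS_COLLECTION`, `HAS_PACKAGE`, and so on) are made up for the example. Nodes and edges go into sets, so an observable that shows up under both a package and an indicator ends up as a single node with two edges.

```python
from collections import namedtuple

Node = namedtuple("Node", ["label", "key"])
Edge = namedtuple("Edge", ["source", "rel", "target"])


def build_graph(community, collection, packages):
    """Turn parsed packages into de-duplicated node and edge sets.

    `packages` is a hypothetical parsed form: a list of dicts with an "id",
    a list of top-level "observables", and "indicators" mapping an indicator
    id to its own list of observables.
    """
    nodes, edges = set(), set()

    def link(source, rel, target):
        nodes.update((source, target))
        edges.add(Edge(source, rel, target))

    comm = Node("Community", community)
    coll = Node("Collection", collection)
    link(comm, "HAS_COLLECTION", coll)
    for pkg in packages:
        package = Node("Package", pkg["id"])
        link(coll, "HAS_PACKAGE", package)
        for value in pkg.get("observables", []):
            link(package, "HAS_OBSERVABLE", Node("Observable", value))
        for ind_id, values in pkg.get("indicators", {}).items():
            indicator = Node("Indicator", ind_id)
            link(package, "HAS_INDICATOR", indicator)
            for value in values:
                link(indicator, "HAS_OBSERVABLE", Node("Observable", value))
    return nodes, edges


nodes, edges = build_graph(
    "Hailataxii",
    "guest.phishtank_com",
    [{"id": "pkg-1",
      "observables": ["http://bad.example.com/a"],
      "indicators": {"ind-1": ["http://bad.example.com/a"]}}],
)
print(len(nodes), "nodes,", len(edges), "edges")
```

From here, each node and edge could be written to the graph database with whatever client you prefer.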

Community view from the top down

Yellow circle is the community, green circle is the collection, small blue circle is the package (told you it could be hundreds), purple is the indicator and reddish is the observable.

Indicators and Observables

That’s about it!  Don’t forget to check out my last post on Suricata NSM fields to see how some of these observables can be found on a network.

Suricata NSM Fields

Please leave feedback if you have any questions!

Collections from Hail a TAXII:

  1. guest.dataForLast_7daysOnly
  2. guest.EmergingThreats_rules
  3. guest.phishtank_com
  4. system.Default
  5. guest.EmergineThreats_rules
  6. guest.dshield_BlockList
  7. guest.Abuse_ch
  8. guest.MalwareDomainList_Hostlist
  9. guest.Lehigh_edu
  10. guest.CyberCrime_Tracker
  11. guest.blutmagie_de_torExits

Suricata NSM Fields

Value of NSM data

From https://suricata-ids.org

Suricata is a high performance Network IDS, IPS and Network Security Monitoring engine. Open Source and owned by a community run non-profit foundation, the Open Information Security Foundation. Suricata is developed by the OISF and its supporting vendors.

Snort was an early IDS that matched signatures against packets seen on a network. Suricata is the next generation of open source tools that looks to expand on the capabilities of Snort. While it continues to monitor packets and can ‘alert’ on matches (think bad IP address, or a sequence of bytes inside of a packet), it expands the capabilities by adding Network Security Monitoring. NSM watches a network (or PCAP file), jams packet payloads together (in order), and does further analysis. From the payloads, Suricata can extract HTTP, FTP, DNS, SMTP, SSL certificate info, etc.

This data can provide invaluable insight into what’s going on in your company or in your home. Store the data for future searches, monitor it actively for immediate notification of wrongdoing, or do anything else you want with it. NSM data allows an analyst to track the spread of malware or trace how a malicious email came through.

Beyond the meta-data, Suricata can also extract the files from monitored sessions.  These files can be analyzed, replayed or shoved into a sandbox and detonated.  Build your own!

Here’s a breakdown of the fields available, but remember they’re not always there. Be careful in your coding.

All records contain general layer 3/4 network information:

  • timestamp
  • protocol
  • in_iface
  • flow_id
  • proto
  • src_ip
  • src_port
  • dest_ip
  • dest_port
  • event_type

This covers TCP/IP, UDP, etc. Each event that gets logged (check out /var/log/eve.json) has this information and more. “event_type” indicates the ‘rest’ of the important data in this NSM record. Values in ‘event_type’ will be one of:

  • http
  • ssh
  • dns
  • smtp
  • email
  • tls
  • fileinfo*
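Since the fields aren’t always there, it pays to read each record defensively. Here’s a small sketch, assuming one JSON object per line and a child object named after the event type; the two sample records are made up and far shorter than real ones.

```python
import json

# Two made-up records; real ones carry many more fields.
SAMPLE_LINES = [
    '{"timestamp": "2016-08-01T12:00:00.000000+0000", "event_type": "http", '
    '"src_ip": "10.0.0.5", "dest_ip": "203.0.113.9", '
    '"http": {"hostname": "bad.example.com", "url": "/login"}}',
    '{"timestamp": "2016-08-01T12:00:01.000000+0000", "event_type": "dns", '
    '"src_ip": "10.0.0.5", "dest_ip": "10.0.0.1"}',
]


def summarize(line):
    """Pull the common fields plus the event-specific object, tolerating gaps."""
    record = json.loads(line)
    event_type = record.get("event_type", "unknown")
    # Assumption: the protocol detail lives in a child object named after
    # event_type. It may be missing entirely, so fall back to an empty dict.
    detail = record.get(event_type) or {}
    return {
        "event_type": event_type,
        "src_ip": record.get("src_ip"),
        "dest_ip": record.get("dest_ip"),
        "detail_fields": sorted(detail),
    }


for line in SAMPLE_LINES:
    print(summarize(line))
```

Using `.get()` everywhere means a record with a missing object (like the DNS one above) produces an empty field list instead of a `KeyError`.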

HTTP:

  • accept
  • accept_charset
  • accept_encoding
  • accept_language
  • accept_datetime
  • authorization
  • cache_control
  • cookie
  • from
  • max_forwards
  • origin
  • pragma
  • proxy_authorization
  • range
  • te
  • via
  • x_requested_with
  • dat
  • x_forwarded_proto
  • x_authenticated_user
  • x_flash_version
  • accept_range
  • age
  • allow
  • connection
  • content_encoding
  • content_language
  • content_length
  • content_location
  • content_md5
  • content_range
  • content_type
  • date
  • etag
  • expires
  • last_modified
  • link
  • location
  • proxy_authenticate
  • referrer
  • refresh
  • retry_after
  • server
  • set_cookie
  • trailer
  • transfer_encoding
  • upgrade
  • vary
  • warning
  • www_authenticate

SSH:

Client/Server are child objects when this is parsed from JSON.

  • client
    •   proto_version
    •   software_version
  • server
    •   proto_version
    •   software_version

DNS:

  • tx_id
  • rrtype
  • rrname
  • type
  • id
  • rdata
  • ttl
  • rcode

SMTP/POP/IMAP:

  • reply_to
  • bcc
  • message_id
  • subject
  • x_mailer
  • user_agent
  • received
  • x_originating_ip
  • in_reply_to
  • references
  • importance
  • priority
  • sensitivity
  • organization
  • content_md5
  • date

TLS:

  • fingerprint
  • issuerdn
  • version
  • sni
  • subject

FILEINFO:

File info is special. It can be associated with other types, like HTTP and SMTP/email. Watch the object carefully; you’ll get a mix of fields.

  • size
  • tx_id
  • state
  • stored
  • filename
  • magic
  • md5
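To show what that mix of fields means in practice, here’s a sketch that checks a fileinfo record against a watch list of hashes (the kind of thing you might pull from STIX observables). The record, the watch list and the alert layout are all made up for the example, and the code assumes the HTTP fields ride along in an `http` object next to `fileinfo`.

```python
import json

# Hypothetical watch list, e.g. file hashes taken from STIX observables.
KNOWN_BAD_MD5 = {"44d88612fea8a8f36de82e1278abb02f"}

# Made-up fileinfo record that also carries an "http" object.
SAMPLE = json.dumps({
    "event_type": "fileinfo",
    "src_ip": "203.0.113.9",
    "dest_ip": "10.0.0.5",
    "http": {"hostname": "bad.example.com", "url": "/sample.bin"},
    "fileinfo": {"filename": "/sample.bin", "size": 68, "stored": False,
                 "md5": "44d88612fea8a8f36de82e1278abb02f"},
})


def check_file(line, bad_hashes):
    """Return a small alert dict if a fileinfo record matches a known-bad md5."""
    record = json.loads(line)
    if record.get("event_type") != "fileinfo":
        return None
    info = record.get("fileinfo") or {}
    md5 = info.get("md5")  # may be absent, so never index it directly
    if md5 is None or md5.lower() not in bad_hashes:
        return None
    http = record.get("http") or {}  # absent when the file came over SMTP etc.
    return {"filename": info.get("filename"), "md5": md5,
            "hostname": http.get("hostname")}


print(check_file(SAMPLE, KNOWN_BAD_MD5))
```

The same function quietly returns `None` for non-fileinfo records and for files without a hash, which is the behaviour you want when reading a mixed log.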

© 2017 De-Coder’s Ring
