De-Coder’s Ring

Consumable Security and Technology


Kent Brake – Interview

This is another exciting podcast in the De-Coder’s Ring series!

My friend Kent Brake joined me, bringing a wealth of knowledge about cybersecurity and a few tools we can use to get started with network and host-based system monitoring.

Kent’s a seasoned security architect and is currently working as a Solutions Architect for a company that you probably know.

In this podcast, we talk about how to start building a network security solution. We discuss Bro, Suricata, Elasticsearch, Graylog, Splunk, and all kinds of fun stuff you can use to create a new monitoring system.

OSQuery?  Yep, talked about that too!

Subscribe here : https://fauie.com/feed/podcast

Technology: The first class citizen

I’ve spent the last few days at Capital One’s Software Engineering conference. How cool is that? Hundreds of tech folks gathering for a few days to discuss areas of technology: modern technology stacks, processes, and new paradigms.

For me, I’ve been able to watch about a half dozen talks on Machine Learning, the programming language Go, and encryption. The speakers were excellent, and, if I play my cards right, I’m going to work to get a few of them on here as guest bloggers!

What topics would you want to hear about?

Threat Hunting: tcpdump

This is the second video in my ‘Threat Hunting with Open Source Software’ series. You can find the first video here: Threat Hunting: The Network and PCAP

This video dives a bit deeper into monitoring networks. First, we’ll go over how to monitor a modern network, along with some tips and tricks to help you avoid gotchas.

For instance, ever wonder why you can’t see other computers’ traffic on your network switch? Yeah, we talk about that!

We eventually work our way towards using tcpdump.   We’ll monitor live traffic and then store it to disk.  Lots of content in here, so let’s get started!
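If you want to follow along outside the video, here’s a minimal sketch of the same capture-and-store workflow, driven from Python’s subprocess module. The interface name, packet count, and file name are placeholders, and tcpdump usually needs root privileges to capture.

    # Capture 1,000 packets on a chosen interface, write them to disk as a
    # pcap file, then read the capture back for offline analysis.
    import subprocess

    interface = "eth0"             # placeholder: use your capture interface
    capture_file = "capture.pcap"  # placeholder output file

    # -nn: skip name/port resolution, -s 0: full packets, -c: stop after N packets
    subprocess.run(
        ["tcpdump", "-i", interface, "-nn", "-s", "0", "-c", "1000", "-w", capture_file],
        check=True,
    )

    # Replay the stored capture (same as running: tcpdump -nn -r capture.pcap)
    subprocess.run(["tcpdump", "-nn", "-r", capture_file], check=True)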

Inverting the Message Bus

I had a conversation this morning where I just (maybe I’m slow) realized how Apache Kafka has inverted the responsibility in the world of message passing.

Traditional enterprise service buses (Wikipedia: Enterprise Service Bus) typically have some smarts built in. The bus itself routes messages, transforms messages, and orchestrates actions based on message attributes. This was the first attempt at building a great mediation layer in an enterprise. Some advantages of the traditional ESB were:

  • Producer/consumer language agnostic
  • Input/output format changes (XML, JSON, etc.)
  • Defined routing and actions on messages

The challenges were typical for traditional enterprise software. Scaling was a mess, and licensing could be cost prohibitive at scale. This meant lower adoption and a general loss of those advantages for smaller projects or customers.

Talk about a huge and complex stack!   Look at this picture for the ‘core’ capabilities of an Enterprise Service Bus:

 

ESB Component Hive


Now let’s take a look at Apache Kafka.

Kafka Diagram


Ok, that’s a lot of arrows, and lines, and blocks, oh my.

BUT, the thing to notice here that’s SUPER important is that they’re all outside the Kafka box. Kafka isn’t smart. In fact, Kafka was designed to be dumb. There is no message routing, there are no message format changes, nothing. The big box in the middle is dumb. It scales really well, and stays dumb.

In fact, the only ‘type’ of communication that Kafka has is publish/subscribe. One (or many) clients produce messages to a topic. They send in data. Doesn’t matter if it’s JSON, XML, Yiddish, etc. It goes to the topic. Kafka batches the messages up and ‘persists’ them as a log file. That’s it. A big old data file on disk. The smarts of Kafka come next… A consumer group (which may be MANY actual instances of software, all sharing the same group ID) subscribes to one or more topics. Kafka (with Zookeeper’s help) remembers which blocks of messages each client in the consumer group has already seen. Ok, that sounds confusing. I’ll try again.

Kafka coordinates which blocks of data get to which client. If the clients are in the same consumer group, then each message is only sent to one member of that group. More than one consumer group can subscribe to a topic, so you can have multiple consumer processes for each topic.
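Here’s a rough sketch of that publish/subscribe flow using the kafka-python client (just one of several Kafka client libraries; the post doesn’t assume any particular one). The broker address, topic name, and group IDs are placeholders. The point is that every consumer sharing a group_id splits the work, while a consumer with a different group_id gets its own full copy of the stream.

    from kafka import KafkaConsumer, KafkaProducer

    # Producer: just sends bytes to a topic; Kafka doesn't care what's inside.
    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer.send("events", b"anything goes: JSON, XML, Yiddish...")
    producer.flush()

    # Consumer: every instance started with group_id="team-a" shares the work
    # for "events", so each message is handled by only one of them. Start
    # another consumer with group_id="team-b" and it receives its own
    # complete copy of the same messages.
    consumer = KafkaConsumer(
        "events",
        bootstrap_servers="localhost:9092",
        group_id="team-a",
        auto_offset_reset="earliest",
    )
    for record in consumer:
        print(record.topic, record.partition, record.offset, record.value)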

Now, instead of the message bus sending messages from one function to another, that work is left up to the clients.   For instance, let’s say you have to ingest an email from a mail server and test it to see if there’s a malicious reply-to address.

First, the message comes in as plain text to the ‘email_ingest’ topic. This topic can be published to by many clients reading data from many servers. Let’s assume Logstash. Logstash will send the message in as plain text. After the message is in the ‘email_ingest’ topic, another program will transform that message to JSON. This program subscribes to ‘email_ingest’, pulls each message, transforms it to JSON, and publishes it back to another topic, ‘email_jsonified’.
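As a sketch (again assuming the kafka-python client, a local broker, and made-up field and group names), the transform stage is just a consumer and a producer glued together:

    import json
    from kafka import KafkaConsumer, KafkaProducer

    consumer = KafkaConsumer(
        "email_ingest",
        bootstrap_servers="localhost:9092",
        group_id="email-jsonifier",   # every instance of this stage shares the group
        auto_offset_reset="earliest",
    )
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda d: json.dumps(d).encode("utf-8"),
    )

    for record in consumer:
        raw_email = record.value.decode("utf-8", errors="replace")
        # 'raw' is an illustrative field name, not a required schema
        producer.send("email_jsonified", {"raw": raw_email})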

The last piece of the puzzle is the code that calls the email hygiene function. This piece takes the longest, due to calling an external API, so it needs to scale horizontally the most. This function reads from ‘email_jsonified’, calls the external API, and, if a malicious IP or reply-to address is detected, publishes the message to the last topic, ‘email_alert’. ‘email_alert’ is subscribed to by another Logstash instance, which pushes the message into Elasticsearch for visualization in Kibana.
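The hygiene stage follows the same consume/produce pattern. In this sketch, check_reply_to is a hypothetical stand-in for whatever external reputation API you would actually call:

    import json
    from kafka import KafkaConsumer, KafkaProducer

    def check_reply_to(email: dict) -> bool:
        # Hypothetical placeholder for the external hygiene/reputation API call.
        return "malicious.example" in email.get("raw", "")

    consumer = KafkaConsumer(
        "email_jsonified",
        bootstrap_servers="localhost:9092",
        group_id="email-hygiene",     # scale this group out the most
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda d: json.dumps(d).encode("utf-8"),
    )

    for record in consumer:
        if check_reply_to(record.value):
            producer.send("email_alert", record.value)  # picked up by Logstash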

Sounds complicated, right?

The big difference here is that the intelligence moved into the clients. The clients need to handle the orchestration, error handling, reporting, etc. That has some pros and cons. It’s great that clients can now be written in many technologies, and there is more ‘freedom’ for a development group to do their own thing in a language or framework they’re best suited for. That can also be bad. Errors add a new challenge. Dead letter queues can be a pain to manage, but, again, it puts the onus on the client organization (in the case of a distributed set of teams) to handle their own errors.

Kafka scales horizontally on a small footprint really easily. It’s mostly a network-I/O-bound system, rather than a CPU- or memory-bound one. It’s important to keep an eye on disk space, memory, and CPU, but they tend not to be an issue if you set up your retention policies in an environment-appropriate manner.
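For example, retention can be pinned per topic when you create it. This sketch uses the kafka-python admin client; the partition count and limits are illustrative values, not recommendations:

    from kafka.admin import KafkaAdminClient, NewTopic

    admin = KafkaAdminClient(bootstrap_servers="localhost:9092")
    admin.create_topics([
        NewTopic(
            name="email_ingest",
            num_partitions=3,
            replication_factor=1,
            topic_configs={
                "retention.ms": str(7 * 24 * 60 * 60 * 1000),  # keep messages 7 days
                "retention.bytes": str(1024 ** 3),             # ~1 GB per partition
            },
        )
    ])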

Reach out if you have any questions!

Do you prefer RabbitMQ?  ActiveMQ?  Kafka?  (They’re not the same, but similar!)

Thoughts on Equifax – Tokenization

The past week’s news cycle has been dominated by information regarding the Equifax breach. There are rumors about how the breach happened, rumors and accusations (armchair judge Judy and executioner) about improper stock sales, and a half million class action lawsuits.

(I exaggerate on numbers)

This post isn’t going to talk about breaches in general, and it’s not going to talk about this breach from a technical standpoint… what I’m going to talk about is something different.

If your bank gets hacked, crap, that sucks.   If your doctor/insurance gets hacked, crap, that sucks.   They’ll sign you up for identity protection, and you’ll forget about it.

The interesting part about Equifax is that you don’t have an account with them. You’ve never interacted with Equifax, except maybe to check your credit score… but guess what? That’s not the information that got stolen.

The data that got stolen, the super important data about your life, was given to them by other entities.  Your bank gave them your information.  Your credit card company gave them your information.  Did you know they did that?

You gave them permission to do so in your contract, but did you realize you did? (Of course not, no one reads the fine print!)

Now that you’re ticked off because someone else shared your data with someone who couldn’t protect it… what do you do?

Nothing.

That’s the way credit reporting and your credit score work. For you to have a credit score, entities have to report to Equifax, TransUnion, and Experian.

What could have been done?   Tokenization.

There’s a concept in security, and more specifically in secure data storage and transmission, called tokenization. Essentially, a developer takes a secret piece of data (SSN, account numbers, etc.) and swaps it for a token. The token can later be swapped back for the original piece of data if necessary. This is great inside a company, like a bank. Instead of storing an SSN in an operational database, you store the token. The only time you need the real SSN might be when printing a document that requires it or filing a legal form.
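To make the idea concrete, here is a deliberately tiny Python sketch of the swap. A real token vault is a hardened, audited service with access controls, not an in-memory dictionary, and the SSN below is made up.

    import secrets

    _vault = {}  # token -> secret; stands in for a secured token vault

    def tokenize(secret_value: str) -> str:
        token = secrets.token_hex(16)  # random, carries no information about the secret
        _vault[token] = secret_value
        return token

    def detokenize(token: str) -> str:
        return _vault[token]           # tightly restricted and audited in practice

    ssn_token = tokenize("078-05-1120")   # store the token in operational systems
    print(ssn_token)                      # useless to a thief on its own
    print(detokenize(ssn_token))          # only when a document truly needs the SSN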

There’s a challenge with tokenization for credit reporting. Every bank needs to report someone like ME as the same person. Bank A can’t have token 1234 for me while Bank B reports me as ABCD. They both need to report me as the same person. If Bank A and Bank B can both report on me as ABCD1234, then Equifax et al. don’t need to know my SSN. The banks I have a relationship with do.

How do we accomplish this? That’s the big, big challenge… it could be my next billion-dollar idea… (yeah, I just gave it away, but execution is 99% of the challenge)

 


© 2017 De-Coder’s Ring
