Fauie Technology

eclectic blogging, technology and hobby farming

Category: software development (page 2 of 3)

Security: Code and Passwords

Developer’s jobs aren’t easy.  Constant deadlines, integrating new technologies… dealing with ‘Ted’ in the cube next to you that shouldn’t be eating those onion rings… you get it..  lots of issues.  #notsnowflakes

stressed out developer

stressed out developer

Now we’re forced to live in this modern world of devops.  No longer can we rely on system administrators to maintain systems.  No longer can we rely on release engineers to package and ship our code.  Now we own it all.

Some of us adapt.  Don’t get me wrong, it’s not an easy task.  Most of us don’t have enough linux-foo, or the ingrained processes to maintain a large elasticsearch cluster… but we cope.  We learn new skills, grow in breadth of knowledge… then that breadth gets deeper.  Holy cow, we’re valuable now!

Unfortunately, security still is not a top tier concern for most software engineers.  We have web exploits to worry about.  We have to worry about SQL Injection.  Stack overflows, kernel panics, all kinds of neat stuff… each of which is the beginning of a piece of vulnerable software.

The one that continues to kill me, and I have this feeling was behind a major breach in the US this week, has to do with account and environment credentials.  There are so many scenarios that require an application to know about credentials:

  1. Database connectivity
  2. External API/Service
  3. Mail servers

tons more.  how do we deal with it?

There are a few anti-patterns

… bad things.. don’t do these.

  1. Hard code the credentials in your code
  2. Use a configuration file, check it into source control
  3. Use environment variables in your public facing website to connect to your super secret database

Those are all dumb.  Don’t do anything.

What can we do?

Separation of connectivity.   Your web application shouldn’t call your database directly, especially if it’s a database with customer data, personally identifiable information or healthcare info.  That’d be dumb.   Connect your web application to an API layer , but still follow some of the ‘other’ advice below.

Supply the passwords at runtime

Use a password vault/key management system to supply passwords to an application.  Build that out into your application framework so your code doesn’t have to be aware of where the password came from.   A password vault is a high security system that allows an authorized application to make a secure request to get private information from.  For instance, your vault could store the ‘production customer database’ information. This could even be information about the host name, port, username and password of the database.

Different environments get different credentials

This one is pretty obvious, but sometimes even the best of us don’t follow this to a T..   ummm…. no, not me.. others..  yeah.. others.   Just like your web sites, always have different passwords for everything.  Don’t reuse credentials in a QA environment and a production environment.

Provision as much as you can in configuration

Putting configuration items, or items that MAY become configurable in code is a bad move.  You’re gonna have a bad time.

You're going to have a bad time

You’re going to have a bad time

Always use configuration files. In the example above, the configuration file would tell your application where to find the password vault. Not the passwords or even the database configuration.

Act like your data is exploited

This point goes kind of against the other development tips.  When building applications, always remember that there’s a chance that the database ends up on the internet.   No one wants to think about it, but, look at Equifax.  Look at Deloitte.  Look at Aetna. Target.  etc.   They got owned, and you very well may too.   Don’t live in fear, but, live in paranoia!

 

Neo4J Tutorial – Published Video!

You may have noticed that my stream of thought posts on Neo4J.  It’s pained my, cause you know, drawing the balls is fun.

Today, I get to announce a published video tutorial on Neo4J by Packt Publishing!

We developed an in depth course, covering a bunch of graph database and Neo4J problems, ranging from:

  • Installation
  • What is a graph database?
  • comparing to a relational database
  • Using Cypher Query Language (Cypher, CQL)
  • Looking at various functions in Neo4j
  • Query profiling

It was a ton of work, over a few months, but, the support from Packt was great.  I’m really looking forward to getting feedback on the course!

http://bit.ly/fauie-neo4j1

Packt Publishing -  Learning Neo4j Graphs and Cypher

Packt Publishing – Learning Neo4j Graphs and Cypher

 

 

 

Simple Tip: Provision an Elasticsearch Node Automatically!

I built out a new Elasticsearch 5.4 cluster today.

Typically, it’s a tedious task.   I haven’t invested in any sort of infrastructure automation technology, because, well, there aren’t enough hours in the day.  I remembered a trick a few of us came up with at a previous bank I used to work for.  Using a shell script in AWS S3, that gets downloaded in a user init script in EC2, and bam, off to the races!

I won’t give away any tricks here, since my boss would kick me… again, but, since this processed was used heavily by me and team previously, I don’t mind sharing.

We didn’t use it specifically for Elasticsearch, but, you can get the gist of how to use it in other applications.

First step, upload the script to AWS S3.    Here, I’ll use an example bucket of “notmybucket.com” – that’s my bucket, don’t try to own it.  for reals.

Let’s call the script “provision.es.sh”

The provision file can look something like this:

#!/bin/bash
yum -y remove java-1.7.0-openjdk
yum -y install java-1.8.0-openjdk   jq  wget
 
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.4.2.rpm
rpm -i elasticsearch-5.4.2.rpm
chkconfig --add elasticsearch
aws s3 cp s3://notmybucket.com/elasticsearch.data.yml.template /etc/elasticsearch/elasticsearch.yml
cd /usr/share/elasticsearch
./bin/elasticsearch-plugin install -b discovery-ec2
./bin/elasticsearch-plugin install -b repository-s3
./bin/elasticsearch-plugin install -b x-pack
cd /etc/elasticsearch
echo "EDIT ES Settings"


INSTANCE_ID=$(/opt/aws/bin/ec2-metadata  --instance-id | cut -f2 -d " ")
AVAIL_ZONE=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .availabilityZone)
REGION=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | jq -r .region)
NAME=$(aws ec2 describe-tags --filters "Name=resource-id,Values=$INSTANCE_ID" --region=$REGION --output=json | jq -r .Tags[0].Value)
echo $INSTANCE_ID
echo $AVAIL_ZONE
echo $REGION
echo $NAME


sed -i -- "s/INPUT_INSTANCE_NAME/$NAME/" elasticsearch.yml
sed -i -- "s/INPUT_RACK/$REGION/"        elasticsearch.yml
sed -i -- "s/REGION/$AVAIL_ZONE/"        elasticsearch.yml

cat elasticsearch.yml
echo "Now run service elasticsearch start"
service elasticsearch start

You’ll see reference to an elasticsearch.data.yml.template.. that’s super simple:

cluster.name: fauie.com.es5

node.name: INPUT_INSTANCE_NAME
node.master: false
node.data: true
node.ingest: true

node.attr.rack: AVAIL_ZONE
network.host: 0.0.0.0
http.port: 9200
cloud:
    aws:
        region:  REGION

discovery:
    zen.hosts_provider: ec2
    ec2:
        groups:  es5.in



script.inline: true
xpack.security.enabled: false
xpack.monitoring.enabled: true
xpack.watcher.enabled: false

Made up a security group, etc… configure the security group to whatever you’re using for your ES cluster.. change the bucket to your bucket.

Each ES host needs a unique name (beats me what will happen to elasticsearch if you have multiple nodes with the same name.. they’re geniuses.. it’s probably fine, but, you can test it, not me).  Alternatively, try to use your instance ID as your node name!

Then your user init data looks super stupid and simple:

#!/bin/bash
sudo su -
aws s3 cp   s3://notmybucket.com/provision.es5.sh   .
bash ./provision.es5.sh

Add user data

Once you complete your EC2 creation, you can verify the output in:

/var/log/cloud-init-output.log

# grep "Now run" /var/log/cloud-init-output.log
Now run service elasticsearch start

 

Ship it!

Every time I forget the mantra “F(orget) It, Ship It”, things don’t go well.  Analysis Paralysis.  Develop towards a stale goal.

Historically, projects get bogged down for ages making sure it’s “perfect”.

Face it, it’s never perfect.  Ever.

This applies to software, companies, features, church activities and anything else that might be new and untried before.  Analysis and rework is the killer of new ideas.

I build products for people to use.  I know the data that my products use.  I know some of the pain points I’m trying to solve for customers (current and future!).  It’s SO easy to say “oh dang, let’s just add this one more XYZ widget before we call MVP”.  It’s easier to add new features than it is to declare a product “good enough”.

OMG! HE SAID GOOD ENOUGH!

Yes, I did, and will again.  Nothing is ever perfect, and “good enough” is not a declaration on the quality/reliability/security of a new piece of software code.  It’s ‘good enough’ for someone to use.  This is why we strive for a minimally viable product, or MVP.

Counter that with the bad attitude: “good enough”.  That’s a statement on being lazy, not having professional quality standards and not giving a crap about what happens once something leaves your desk.  This is NOT what I’m advocating for.

Draw a line in the sand

Before you build, define your target. Define your MVP.  Define what is ‘good enough’ to your customer.   It can’t suck.  It has to add value.  It has to be easy (enough) to use.  It can’t be ugly, but it doesn’t have to be a work of art.  Ever see the first Google home page or the first version of Splunk?   Compare them to the current interfaces.  Good enough at work.

 

 

Quick Heimdall Data Install

Anyone played with Heimdall data’s software?  I was introduced a few weeks ago.  They’re a super early startup (love!) with a pretty cool technology.

The feature that I really latched onto was the invisible cacheing.  The first time I talked about a write-through cache was with nPulse tech, when dealing with some of the indexing we did, and there wasn’t really any easy technology to use..  typically its (pseudocode, python ish):

def my_function_by_id( id ):  
    out_object = check_redis_cache(id)
    if not out_object:
        out_object = db.execute("select * from my_table where id = %s", ( id, ))
        set_redis_cache(id, out_object)
    return out_object

 

What happens when you switch cache services.   Go from Redis (simple) to Hazelcast (complex)?

Wouldn’t it be better to just:

def my_function_by_id( id ):  

    out_object =  db.execute("select * from my_table where id = %s", ( id, ))

    return out_object

Yes, that didn’t save that much code.. but, how many places in your code do you interact with your database?

Enter in the Heimdall data system.      I can write my code, connect to their proxy (since I love python, I’ll use the proxy, but they DO have a JDBC driver.. update your config and you’re off to the races)

The software identifies my SQL statements, extracts patterns, extracts parameters, and automatically sets up the cache.

Let’s follow instructions from here:

 

Alright, let’s start from a fresh Centos 7 virtual machine

sudo su -
bash <(curl -s http://download.heimdalldata.com/downloads/serverinstall.sh)

Then, grabbed the service file from here:  https://rayed.com/wordpress/?p=1496

Put the file contents in /etc/rc.d/init.d/supervisord

chmod +x  /etc/rc.d/init.d/supervisord

sudo chkconfig --add supervisord

sudo chkconfig supervisord on

echo "SELINUX=permissive" > /etc/selinux/config

firewall-cmd --zone=public --add-port=8087/tcp --permanent

shutdown -r now  #Yeah, hack that SE Linux out of here

I had a little issue running the proxy on my VM, when it booted, it preferred, and only listened in the IPV6 addresses.  Quick fix to /etc/supervisor/conf.d/heimdallserver.conf

changed:

command=java -server  -jar /opt/heimdall/heimdallserver.jar

to

command=java -server -Djava.net.preferIPv4Stack=true  -jar /opt/heimdall/heimdallserver.jar

Reloaded supervisord

sudo service supervisord restart

waited a few minutes… tried my web browser and there it is!

Stay tuned.  In the near future, I’ll do a video walk through of these, and at least two more overall videos.

This is a good start for me to practice some on screen time.   More on that to come!  I can’t wait to share the big news.. you’ll hear LOTS more of me.

 

chris.

 

« Older posts Newer posts »

© 2020 Fauie Technology

Theme by Anders NorenUp ↑