Yesterday I stumbled across some code that contained a mistake a lot of Python developers get wrong. Let’s see why and how to fix it.
A few weeks back I have started contributing to the awesome Mythril project. Mythril is a security scanner for smart contracts that allows everyone to look for vulnerabilities on- and off-chain by being able to analyze raw smart contract code, as well as the actual Solidity code file. To make setting it up more easy, the devs provide a Docker container for easy deployment and use via docker run.
A friend of mine is following a PhD in a non-technical field. And his boss is a bully. Work mainly happens with high-level statistical analysis tools. No one knows anything about programming and most problems are solved by hand. While on a positive note this means good chances to get a student job, it also means that progress moves slowly, especially when it comes to working with large datasets.
I recently found myself in the situation where I was given access to a huge MySQL database that contained network traffic flows and IDS signature match data. As I work a lot with graph-based approaches, I needed to convert the table’s flow data into a graphml file for later visualization and analysis with scripts I have already written. Now without further ado here’s the code…
A few hours back I stumbled into a problem where I had to perform a lookahead of n elements in a list to do some calculations. The first thought: Just take the current index and get all elements until i+n. I started writing..
A few days back I stumbled across an interesting problem. I was asked to develop a solution that was doing some analysis work on geolocation data stored in KMZ format. Existing solutions like fastkml (64KB) and pykml (42KB) seemed nice at the first glance, proved to be unnecessary overhead, however. They’re mostly meant to manipulate and write data into KML format. I just needed to read the data for my later calculations. So I decided to build a solution using the Python Standard Library.
For some research on botnet host detection in large-scale networks, I found myself in the situation that I had to apply a set of algorithms to a huge packet dump. To comprehend an amazing paper, I started to play around with the dataset and tried to reproduce the results presented in the whitepaper. Quickly I realized that there was something fishy with my own dataset, so I fired up jupyter-notebook to gain some more insight in the IP structure of my dataset.
Nope, I’m not going to join the goto war. Even though it’s shunned among developers, there are still some situations where it makes sense. A good friend of mine with a background in C recently came to me with a very simple problem that still made him scratch his head when he tried to express it in Python. The problem broke down to comparing three lists to find an element that meets a special criterion. His basic naive concept looked something like this in pseudo-code…
We all have that special someone in our life. Someone who dares to commit and push dirty code into the master branch. Let’s see how we can force people to learn using Git hooks!