When I first started web development professionally, queries were written in-line and DBA’s complained about developer SQL performance. This was at the end of an age. Injection attacks were already hitting some of the large websites, and it became clear that Parameterization was the simplest way to counter it. Answers to the question of how best to check and enforce parameterization became relevant. PDO and MySQLi libraries were extremely relevant for PHP, and most of the large open source, and even paid platforms, had a hard time adjusting to the new approach. These days, the answers are pretty cut and dried, but, what if they are wrong? Here are a few approaches to parameterization – and an introduction of my personal favorite for microservices.
At the end of 2011 I started a new job. I also found out about a new thing. A query builder. It essentially had each operation from MSSQL mapped into a PHP object – and – you could build a query by referencing one of each of the major pieces of a command. Sub-selection was particularly interesting. To go along with the query builder was a massive quite of tests to make sure all variations of the builder worked properly, and, a large instructional document on how exactly to use the builder. The one I worked on was quite proprietary, but, the link above is somewhat close – although not as refined approach.
The SQL builder was actually pretty decent at parameterizing queries. The queries themselves weren’t too hard to read. After some tweaking – it was also fairly straightforward to dump out the ‘built’ query, including the params. Well, I thought so anyhow…
The learning curve was steep. A new developer was learning SQL and QueryBuilder at the same time. And even veterans preferred to call a raw string because they couldn’t remember where the document was, or, the were under a time crunch and they just needed it to work. Troubleshooting them was also a pain, having ‘comments’ in queries helped trace things when a DBA asked about the query, but, if the logger callbacks weren’t set quite right, the stack trace would end in the middle of the object builder.
Object-Relational Mapping (ORM)
The ORM is largely regarded by many developers I’ve met as the ultimate and final solution to talking to the database. The ORM model really abstracts away from the query. Now actions are objects, and parameters are obvious and easily manipulated. There’s documentation on how to use, naturally, and there’s a raw option in case something isn’t supported. You never know, pivots are weird! They are the all encompassing framework model building approach.
It’s object oriented! You don’t have to worry about the query syntax! You can even use it to enforce schema!~Various Developers
The ORM really does a great job of covering parameterization. In fact, they often have a ‘query builder’ built into them, as it is a necessary piece of understanding what the ORM is doing. They’ve also got a lot of industry backing. Major players in the PHP space have really built them out. On top of this, they are pretty solid and don’t change much. They can also provide developer understanding of the database. Or, they can cloud it. The queries generated will always be consistent. That’s a good thing – until it isn’t.
ORMs are complicated. The concept is simple enough – even the documentation. That said. The DBA, or DB performance monitor just complained about a query – something that came out of your code base. Good luck identifying it quickly. Figuring out how to get a performant query can also sometimes be annoying. Forget migrating queries. That just doesn’t work.
ORMs are heavy. Building models and defining the entire database in code is a duplication of the DB itself. It’s a lot of putz work, and even if auto-generated, it’s still a significant effort to maintain an ORM.
ORMs still have a raw option. For all their splendor, ORMs still have the same flaw as a straight-up in-line sql query. raw. Any developer can potential get one of these going, and, voila, injection attack vector alert.
I’ve always had a desire for clean, intuitive, simple code. I’ve found that ORM’s and Builders are clean, but they don’t typically fit into simple, or even intuitive code. For that reason – I decided it was time to provide something that answers all three, and still remains true to security. I started off creating a query builder. Then realized it wouldn’t provide what I wanted. I scrapped that. In its place, I create a query validator. It validates a basic query, checking for parameters, and identifying if there should be parameters – it will fail without. I also added an Object Oriented approach for ‘sub’ queries. Recursive validation and what not.
Parameters are cool, in queries they’re sweet.
ORM’s bite – this new library is a treat!
Legible queries are easily written and read – I don’t need to build a huge empire of code to contact the database. I can just talk to the Database, even using specific query options of different databases. most of all – raw IS almost entirely the string written! With the validation piece too, it mitigates the risk of someone entering SQL strings without proper parameters.
I do kind of miss the data model in the code base. I also am concerned about query duplication in various code places. That is mitigated by a microservice architecture – but the potential still exists. This approach is also something that a large development team might eschew, as it just isn’t industry standard.
I think there are appropriate tools for appropriate scenarios. I didn’t really cover in-line queries – because they seem to fit the bill when the others fail. To that end, I’ve prepared something of a table as to where this could translate in real-world applications. Naturally, I am a bit biased.
Feel free to comment!