With most websites now using some form of client-side analytics, we tend not to parse and analyse Apache log files automatically any more. However, for people managing web servers they are still the first port of call when troubleshooting issues. To help quickly work out what is happening, I wrote a small command-line tool that parses Apache log files and reports statistics from them.
Processing user requests in the backend of a website feels like a mostly solved problem. Doing the same in frontend web applications, on the other hand, is still being improved on all the time. Here I look at how managing state differs in the two environments, and how this affects application architecture.
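As a rough illustration of the kind of parsing involved, here is a minimal Python sketch that matches lines in Apache's Common Log Format and tallies status codes and requested paths. It is not the tool described above: the names `LOG_PATTERN`, `parse_line` and `summarize` are hypothetical, and the pattern assumes the default log layout (extra fields from the combined format, such as referer and user agent, are simply ignored).

```python
import re
from collections import Counter

# Assumed layout: Apache Common Log Format. Anything after the size field
# (e.g. the referer and user agent of the combined format) is ignored.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Count responses per status code and per requested path."""
    statuses = Counter()
    paths = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip malformed or unexpected lines
        statuses[entry["status"]] += 1
        paths[entry["path"]] += 1
    return statuses, paths
```

Calling `summarize(open("access.log"))` on a log file would return two counters, from which `most_common()` gives the most frequent status codes and paths.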
While there are many guides on setting up Apache Cordova for Android development on Ubuntu, none seemed to cover all the issues I encountered - it took a combination of resources to get there. I decided to write up how I set this up, along with the issues I encountered along the way.
Low-bandwidth connections often come with high latency, so when implementing a site for low-bandwidth conditions we must pay attention to the packaging and number of assets we include on each page. Here we look at an approach for combining assets in Django while preserving the relationship between each asset and the code that required it.
CKAN, an open source data portal platform, provides an API for fetching everything from datasets to individual records. Here we look at how CKAN's architecture allows developers to transparently re-implement the datastore API, and how this was used to improve performance by switching all searches to using a Solr backend.
Docker is a tool to help in the deployment of applications across host systems. Virtualization, union file systems, image registries, orchestration services - while Docker is a useful tool for staging and deployment, there is a learning curve to get to grips with the whole ecosystem.
Docker is extremely useful for deploying packages, but also for creating environments for running functional tests. Here I show how to easily set up a cluster of PostgreSQL servers in master/slave replication using Docker, and how to drive them using the Docker Python API.
Node is a great platform when it comes to handling and dispatching numerous concurrent requests. Here I explain how I implemented a proxy which limits the number of concurrent requests per client, and prioritizes the queue of requests using user-defined criteria.
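The core bookkeeping behind such a proxy can be sketched in a few lines. The following is a Python illustration rather than the Node implementation described above, and every name in it (`ClientQueue`, `submit`, `finish`) is hypothetical. It assumes that a lower priority number means the request should run sooner, and that the caller reports when a request has completed.

```python
import heapq
import itertools


class ClientQueue:
    """Caps in-flight requests per client; queued requests leave in priority order."""

    def __init__(self, max_concurrent):
        self.max_concurrent = max_concurrent
        self.active = {}   # client id -> number of requests in flight
        self.pending = {}  # client id -> heap of (priority, seq, request)
        # Sequence number breaks ties so equal priorities stay first-in, first-out.
        self._seq = itertools.count()

    def submit(self, client, request, priority=0):
        """Return the request if it may start now, else queue it and return None."""
        if self.active.get(client, 0) < self.max_concurrent:
            self.active[client] = self.active.get(client, 0) + 1
            return request
        heap = self.pending.setdefault(client, [])
        heapq.heappush(heap, (priority, next(self._seq), request))
        return None

    def finish(self, client):
        """Mark one request done; return the next queued request to start, if any."""
        heap = self.pending.get(client)
        if heap:
            _, _, request = heapq.heappop(heap)
            return request  # the freed slot is handed over, so the count is unchanged
        self.active[client] -= 1
        return None
```

With `max_concurrent=1`, a second and third request from the same client are held back, and each call to `finish` releases whichever queued request has the lowest priority number. In a real proxy the user-defined criteria would decide the `priority` value passed to `submit`.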
While the Python ecosystem is replete with libraries for fetching data over the web, none of them gives you an easy way to interrupt a request before the queried server has returned the response headers. Because servers more often than not only send response headers once they have the full response at hand, this makes it impossible to release resources early. Here I show a possible implementation based on sockets and httplib.
CKAN implements a permission system based on roles, permissions and authorization functions which can be overridden by plugins. I used this to implement the ckanext-userdatasets plugin. The aim of this plugin is to allow certain types of users to create datasets in an organization without having the permission to edit or delete other users' datasets. Here I describe how I went about implementing this plugin.