Better analytics for Folding@Home…

… were needed IMO, so I decided to build my own and they’re now live. It’s not a nice interface at the moment, but the data is being collected in the background 🙂

The home page shows the top 10 teams for score and also for work units, the ten trending teams (I’ve not worked on this algorithm yet) and the ten newest teams with non-zero scores/WU. Clicking on a team’s name displays the team’s info and some historic values.

Here’s how it works: four times a day, data is pulled from FAH and added to the database. New rankings for score and work units are then calculated. Once a day a history entry is generated and then, in the background and in batches of 500 records, the score differences and the like are calculated… basically lots of scripts running in the background, triggered by cron. There’s also a table recording the time, the script, the time taken and whether the script got to the end or timed out.
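As a rough illustration of the batching idea only (the real scripts aren’t shown here, so the `HistoryRow` layout and the `diffBatch` name are made up for this sketch), working through the history rows 500 at a time could look something like this in C++:

```cpp
#include <algorithm>
#include <cstddef>
#include <stdint.h>
#include <vector>

// Hypothetical row layout: one history entry per team per day.
struct HistoryRow {
    uint32_t teamId;
    int64_t scoreToday;
    int64_t scoreYesterday;
    int64_t scoreDiff;  // filled in by diffBatch
};

// Fills in scoreDiff for up to batchSize rows starting at offset.
// Returns how many rows were processed, so the caller can stop at 0.
inline std::size_t diffBatch(std::vector<HistoryRow>& rows,
                             std::size_t offset,
                             std::size_t batchSize = 500) {
    if (offset >= rows.size()) return 0;
    const std::size_t end = std::min(rows.size(), offset + batchSize);
    for (std::size_t i = offset; i < end; ++i) {
        rows[i].scoreDiff = rows[i].scoreToday - rows[i].scoreYesterday;
    }
    return end - offset;
}
```

Each cron run would call `diffBatch` with the next offset; a return value of 0 means there is nothing left to do for that day.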

Things I need to do:

  • A nice interface.
  • Fix the efficiency of the differencing script… it’s a massive fail atm (indexing).
  • Add some basic site stats: days, records, size… queries per day and database use?
  • Add some graphs in the future.
  • Signature thingy.

Oh and then I found:

edit {hack} a .deb package to remove user options – silent install.

So, with my new Dell M1000e and M600 blades now sitting in their new home in my workshop, and having tested that all the blades are working (-1), I now need to load test them. I could just use a program designed for this, but I’m thinking that as I’ll be burning through a chunk of electricity I’ll put the compute cycles to good use (the heat will be keeping me warm). So I’m going to do some Folding at Home. If you don’t know about this, then you should check out this great project.

My team id is 232280. Feel free to join me.

Anyway, I’ll be running Ubuntu on the blades for a while (when I can get the latest version installing from USB), so I played around with installing inside a VM. There are instructions for this, but take them with a pinch of salt as a few links are broken or need fixing (I’m letting them know on the forum). The installer asks for user details as part of the package and you can’t do an ‘unattended install’, which is what I was after (for a very specific reason). NVM, I’ll just figure out how to disassemble the .deb package and change it.

I haven’t done this before, so I started not knowing whether it was possible, but it’s actually quite simple. I found a post explaining how to open and then recreate the .deb package.

mkdir tmp
dpkg-deb -R original.deb tmp
# edit tmp/DEBIAN/postinst
dpkg-deb -b tmp fixed.deb

…and then had a bit of a read up on what I found inside (read this, sec 7.6). After a bit of head scratching and a few edits that didn’t work, I finally worked out what I needed to do to remove the user interaction (some of this helped). Rename the templates file and comment out the lines referencing db_get, then add in the variables for user, team, passkey, power and autostart. (One thing I would like to do is add some sort of reference to each machine’s MAC address in the user, but I’ll look into that later.) After doing this I’ve hacked the .deb into something that I can script silently, which means remote deployment :D.

I also came across this while wandering the web. It’s really well written for beginners like me.

You can find my hacked .deb here. My team id is baked in. Install using:

sudo dpkg -i --force-depends FAHinstaller.deb

Ver 2 now includes the MAC address in the username so I can see how much each blade has done.

The DHT11 temperature-humidity sensors pt 2

Here are the graphs from my recent experiment, and here’s the data.

Temperature from 11 DHT11 sensors over 22900 data readings for each sensor


Humidity from 11 DHT11 sensors over 22900 data readings for each sensor


min, avg, max and variance of temperature from 11 DHT11 sensors over 22900 data readings for each sensor


min, avg, max and variance of humidity from 11 DHT11 sensors over 22900 data readings for each sensor


and a brief video…

The DHT11 temperature-humidity sensors

So I have loads of these from an old project that, you guessed it, I never got round to. But how accurate are these cheapest of cheap things?

Set up an Arduino/RasPi with all the sensors and get it to read them all back. If they’re all in one location then they should all read the same, right? Let’s see how accurate they are! I’ll run the test for a week (probably longer) with various conditions and then report back.

oh and some code….

// original code from ladyada, public domain
#include <SPI.h>
#include <SD.h>
#include "DHT.h"

const int NUM_SENSORS = 12;
// As in the original: only the first 11 sensors are written to the SD card.
const int NUM_LOGGED = 11;
// Assumption: SD chip select on pin 4 (left free by the sensor wiring below);
// change this to suit your board or shield.
const int SD_CS_PIN = 4;

DHT sensors[NUM_SENSORS] = {
  DHT(2, DHT11), DHT(3, DHT11), DHT(5, DHT11), DHT(6, DHT11),
  DHT(7, DHT11), DHT(8, DHT11), DHT(9, DHT11),
  DHT(A0, DHT11), DHT(A1, DHT11), DHT(A2, DHT11),
  // older sensors
  DHT(A4, DHT11), DHT(A5, DHT11)
};

float temps[NUM_SENSORS];
float hmids[NUM_SENSORS];

void setup() {
  Serial.begin(9600);  // assumption: 9600 baud, as in the library example
  Serial.println("DHT test!");
  for (int i = 0; i < NUM_SENSORS; i++) {
    sensors[i].begin();
  }
  if (!SD.begin(SD_CS_PIN)) {
    Serial.println("SD card init failed");
  }
}

void loop() {
  // Wait a few seconds between measurements.
  delay(2000);  // assumption: 2 s, as in the library example

  for (int i = 0; i < NUM_SENSORS; i++) {
    hmids[i] = sensors[i].readHumidity();
    temps[i] = sensors[i].readTemperature();
  }

  Serial.print("Temps ");
  for (int i = 0; i < NUM_SENSORS; i++) {
    Serial.print(", ");
    Serial.print(temps[i]);
  }
  Serial.println();

  Serial.print("humids ");
  for (int i = 0; i < NUM_SENSORS; i++) {
    Serial.print(", ");
    Serial.print(hmids[i]);
  }
  Serial.println();

  // write data to SD card
  File dataFile = SD.open("dhtdata.txt", FILE_WRITE);
  if (dataFile) {
    dataFile.print("Temps");
    for (int i = 0; i < NUM_LOGGED; i++) {
      dataFile.print(", ");
      dataFile.print(temps[i]);
    }
    dataFile.println();

    dataFile.print("Humids");
    for (int i = 0; i < NUM_LOGGED; i++) {
      dataFile.print(", ");
      dataFile.print(hmids[i]);
    }
    dataFile.println();
    dataFile.close();
  }
}



How much of a table is in use?

I’ve been working on a web-based monitoring tool for a while now and it’s getting close to completion. A thought crossed my mind, and be prepared, it’s a wild one.

WHAT IF: my site gets so popular I hit the upper limits of the index or field size of the primary keys of the tables?

Well clearly, it’d break and then I’d get an email or something telling me to fix it, and I would go and change the field types to allow more… problem solved… but it’s not very nice, and it means downtime! (Ironically, not something I want for a tool designed to measure downtime!)

Anyway – I’m building in monthly, weekly and daily email routines to let me know certain bits of information regularly without me having to log in, and I thought why not add something that tells me how much I have used? So I did.

Here’s my solution (but not my code):

get a list of tables,
for each table,
get table name,
get primary key,
get field type and convert to human friendly numbers,
get number of rows in use,
work out usage in %,

Half an hour of Google and a bit of common sense later, I now have a page that shows me the stats of the primary key usage, and I can add them to the email routines.

RGB colour correction

So I’ve been playing with different colour correction techniques for my LEDs.

So here’s a simple one. Square it. Well ok, square it then divide through by the upper value (square and scale). So:
unsigned int value = (i*i)/255;

which sort of works. But it’s a little jumpy in the lower regions, as runs of inputs collapse onto a single output: everything from 23 to 27 maps to 2, etc.
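For reference, the square and scale mapping can be wrapped up as a small function. This is just a sketch for 8-bit values; the name `squareAndScale` is made up here, and the cast is there because on a 16-bit Arduino a signed `int` would overflow at 255*255 = 65025, while an unsigned 16-bit value still fits.

```cpp
#include <stdint.h>

// Square the input, then divide by the upper value (255) so the
// result lands back in 0..255. Integer division truncates.
inline uint8_t squareAndScale(uint8_t i) {
    // 255 * 255 = 65025 fits in an unsigned 16-bit value.
    return static_cast<uint8_t>((static_cast<uint16_t>(i) * i) / 255);
}
```

Feeding in 22, 23, 27 and 28 gives 1, 2, 2 and 3, which is the stepping described above.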

The next one I’m trying is something called quadratic interpolation using a Lagrange polynomial. BIG words for a curve of best fit. It works by generating a polynomial through a number of predetermined points. I have worked it out using 1->1, 255->255, and then 128->c. I can then change c, effectively changing the correction to the colour. The only problem is that it gets a little complex. So the maths formula, with x as the input value, is:

(-cx^2 + 256cx - 255c + 128x^2 - 16639x + 32640) / 16129

I attempted and failed to implement this, as Arduino isn’t really the right platform for doing this kind of arithmetic. It works in Excel though, and produces a lovely graph.

mapping curves

A graph showing various possible curves

The black curve in the graph is actually the square and scale method above. The others are for various choices of c.

So I quickly mapped the first 32 values i->x to the 32 LEDs I have and they are quite linear; from there on I found every bank to be quite similar. Which led me to the conclusion that for small values a one-to-one correspondence was ideal, and after that a kind of S-curve would be ideal. I tried a few cubics but they were quite complex to implement, and I even tried a quartic.
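For what it’s worth, the quadratic through 1->1, 128->c and 255->255 can be evaluated with whole numbers only, because every term stays well inside a 32-bit signed integer (the largest is 255 * 255 * 255 = 16581375). The sketch below is one possible way to write it, not something tested on real hardware; the name `quadraticCorrect`, the rounding and the clamping to 0..255 are choices made for this example.

```cpp
#include <stdint.h>

// Evaluates (-c*x^2 + 256*c*x - 255*c + 128*x^2 - 16639*x + 32640) / 16129
// using 32-bit integers, rounded to nearest and clamped to 0..255.
inline uint8_t quadraticCorrect(uint8_t x, uint8_t c) {
    const int32_t xi = x;
    const int32_t ci = c;
    const int32_t num = -ci * xi * xi + 256 * ci * xi - 255 * ci
                      + 128 * xi * xi - 16639 * xi + 32640;
    if (num <= 0) return 0;                       // curve dipped below zero
    const int32_t y = (num + 16129 / 2) / 16129;  // round to nearest
    return y > 255 ? 255 : static_cast<uint8_t>(y);
}
```

With c = 128 the numerator collapses to 16129x, so the curve is the straight line x->x; smaller values of c pull the middle of the curve down and larger values push it up.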

So back to a basic square and scale I think for now.

Hyper-V and Vlans

So I’ve now got 2 identical hosts running Server 2012 Datacenter, with VMs on both. Now it’s time to get them talking to one another. Simple, right? Just sit them all on ‘external’ virtual switches and they can use the physical network to talk to each other. Great, except… I don’t really want my test environment hooked up to my live environment, nor do I want to change my existing DHCP server settings or have two DHCP servers on the same broadcast domain!

So, Hyper-V can use VLANs to segregate traffic. This is the way forward for me.

  1. create a new VLAN for the test environment,
  2. tag the ports for the VLAN,
  3. put all VMs on the new VLAN…
  4. oh and change the management VLAN in the virtual switch manager for the hosts…

Groovy, now I can talk to the host over my LAN (on VLAN 1) and the VMs can talk to each other on VLAN 200, but as VLANs they are completely segregated in terms of broadcast domains (for DHCP) and packets. Now I can set up a (virtual) router to route between them. pfSense time.