Tuesday 23 July 2013

Beautiful splotting

Recently, I've been doing rather a lot of R-matrix analysis. In short this means I'm trying to understand the probability of nuclear reactions occurring as a function of both energy and scattering angle. The gnuplot splot command is great for viewing the calculations and data as a function of two variables, and makes fitting the data much easier.

However, with many data points, depth perception can get a little tricky. For example, a typical plot might look something like this:


In this case, the agreement between the curve and the points isn't fantastic, but it simply isn't clear which point should be compared to which line.  You can't easily distinguish both the x and y coordinates of each point.  Even interactively rotating the plot doesn't help much when the point density is this high.

The tip here is to colour the points and calculations to help depth perception.  Adding the palette option to the splot command will colour by a fourth variable that is appended to the column specification.  Here's the sample code:

set terminal png enhanced size 800,600 font "sans, 16"

set xlabel "Energy (MeV)"
set ylabel "Angle (degrees)" offset 0,-1,0
set zlabel "Cross section (mb)" rotate by 90

set palette model RGB defined ( 0.0 'black', 0.3 'purple', 0.5 "blue", 0.7 "green", 0.9 "orange", 1.0 "red" )

set output "splotting.png"  
splot "data.dat" u 1:3:6:3 w p ps 1 palette notitle,\
      "data.dat" u 1:3:4:3 w l lw 2 palette notitle

In this case, all the data is in data.dat.  Columns 1 and 3 contain the energy and angle respectively.  Column 4 and 6 contain the data points and the calculations.  The fourth specifier (3) tells gnuplot to colour things by the angle - in this case, the y-axis variable.  The resulting plot looks like this:


It is now clear which lines should be compared to which points.  The palette option certainly makes for prettier splots and can be really useful for improving their readability. The range of colours can be tuned in each case using the set palette model command to match the spread of your data to make things as clear as possible.

Tuesday 30 April 2013

Bar Charts

I wanted to make a bar chart for a recent post on my real blog (on the difference between different citation databases for my recent papers).

I needed to look up enough of the details of how to do it in gnuplot, that I thought I'd document it here for my own use, and in the hope that others may get some help from it.  I pass my thanks to the gnuplotting blog for some useful tips to help me get started, and the nice idea that having less prominent grids and tick labels makes things easier to read.

The figure is above, and the necessary code follows, with commentary below:
set term 'pngcairo' font 'Helvetica,12'
set out 'cites.png'

set style line 1 lc rgb '#440000' lt 1
set style line 2 lc rgb '#882200' lt 1
set style line 3 lc rgb '#bb6622' lt 1
set style line 4 lc rgb '#ffcc66' lt 1

set style line 11 lc rgb '#808080' lt 1
set border 3 back ls 11
set tics nomirror
set xtics 1,1.0,25.0 offset -1.5,0.0
set style line 12 lc rgb '#808080' lt 0 lw 1
set grid back ls 12

set style fill solid 1.0 border rgb 'grey30'

set ylabel 'citations'
set xlabel 'Paper Index'
set xrange [0:25]

bw=0.15
plot 'cite-data' u ($0+0.5-1.5*bw):1:(bw) w boxes ls 1 t 'Google Scholar', \
     'cite-data' u ($0+0.5-0.5*bw):2:(bw) w boxes ls 2 t 'Scopus', \
     'cite-data' u ($0+0.5+0.5*bw):3:(bw) w boxes ls 3 t 'ISI', \
     'cite-data' u ($0+0.5+1.5*bw):4:(bw) w boxes ls 4 t 'Journal site'
I used the pngcairo terminal, in the belief that it is better than the png terminal, but I should really investigate that.

The block of "set style line" commands is used just to set the colour of the filling of the bars in the bar chart.  If you follow the link to the blog post from my other blog that I mentioned above, you'll see that I didn't have those colour-changing lines, and used the default colours instead.  I'm not sure which I prefer, but this at least shows how you can change them.

The next block sets up the border, tick marks and labels and the grid.  Line styles 11 and 12 are used to set the border and grid to be grey rather than black, to accentuate the actual data.  I used the set tics command to define user tick marks.  I set up the data file to have no abscissa column, so it will be plotting it just by the ordinal number, and a little playing with the offset enabled me to get the numbers in the right place.

The set style fill line is needed to fill in the bars.  If you don't set this, you'll just have box outlines, so if that's what you want, don't have this line.  You can specify the border to be a different colour from the fill, as indicated.

The last section sets up the bars.  I have four bars to show per ordinate.  I decided to make each one take up a fraction of 0.15 the width of the space available for each data point.  The bw=0.15 sets up a variable to store this.  In the plot command, the $0 specifies the ordinate for each point, since I don't provide x-value points in the cite-data file.  The calculations such as $0+0.5-1.5*bw specify the centre of the point at which the bar appears, and the (bw) the width.

If you want to play with this, here is the file cite-data referred to in the gnuplot script.

Tuesday 1 January 2013

Fourier series, conditions and recursive functions

This script uses recursive functions to plot a square pulse function and its Fourier series expansion, truncated to some maximum Nmax.  The terms of the series itself must be found analytically, but once found this script allows one to visualize the convergence of the series.  The resulting plot is shown on the right.

The result of the Fourier series is found by summing over the terms of the series, each of which have a simple analytic dependence on the summation variable.  To do this, the gnuplot script makes use of the Ternary operator ?, which can be used for conditional assignment.  The syntax for this can be described as:
y(x) = (cond) ? [true] : [false]
The function y(x) takes the value [true] if condition (cond) is true, and [false] if condition (cond) is false.  The Ternary operator itself is ? and a colon : is used to separate the possible results [true] and [false].  Both [true] and [false] may be further conditional statements, meaning the statements can be nested, e.g.,
y(x) = (cond1) ? (cond2) ? [true2] : [false2] : [false1]
If (cond1) is true, then the second condition (cond2) is tested.  If both are true, y(x) takes the value [true2].  If (cond1) is true but (cond2) is false, y(x) has the value [false2].  Finally if (cond1) is false y(x) takes the value [false1].

The step function for the present example is defined as:
f(x) = -2 (-π≤x<0)
     =  1 (0≤x<+π)
and is periodic with period 2π.  The terms of the Fourier series can be evaluated analytically.  The gnuplot code to produce the plot is:
set terminal png enhanced size 800,600 font "sans, 18"
set output "fourier.png"

set tics nomirror
set border 2
set xzeroaxis lt -1 lw 2
set xtics axis

pi = 3.14159265359

# f(x) is the step function that is being expanded
f(x) = (x<-pi) ? f(x+2*pi) : (x>pi) ? f(x-2*pi) : (x<0) ? -2 : 1

# g(n,x) is the nth term in the Fourier series
g(n,x) = (n==0) ? -0.5 : (1-(-1)**n)*3*sin(n*x)/(n*pi)

# s(n,x) is the Fourier series up to the nth term
s(n,x) = (n>=0) ? g(n,x) + s(n-1,x) : 0

set yrange[-3.5:2.5]
set xlabel "x"
set ylabel "f(x)"

set samples 400

# set the maximum term in the series
nmax1=3
nmax2=10

plot f(x) title "Step pulse",\
     s(nmax1,x) title "Fourier Series, N_{max}=".nmax1,\
     s(nmax2,x) title "Fourier Series, N_{max}=".nmax2
The first two lines set the png terminal options and the output file name.  f(x) is the step function itself.  The first two conditions (x<-pi) and (x>pi) move the value of x into the range -π to +π, and the last condition (x<0) gives the function the value -2 or 1. g(n,x) is the nth term in the Fourier series, determined analytically, and s(n,x) gives the full Fourier series up to a maximum value of n, nmax. The condition in s(n,x) makes sure the recursion stops when n=0.  The script will plot the exact step function and the Fourier series truncated to maximum order nmax1 and nmax2.  By varying nmax, the convergence of the series can be seen.

There are a couple other neat tricks in there too.  The second group of set commands remove the normal borders and place the x-axis at y=0, and in the last line the string concatenation operator "." is used to append the gnuplot variable nmax onto the title for the Fourier series.