Field Theory

When I got into the PhD program at IFISC at the end of 2017 I was told I was going to work on Twitter data. I was fresh out of graduation, with a Master’s thesis on the opinion dynamics of political fake news in the US, but the Twitter data stuff was slightly different. My initial funding came from a project on human mobility, so we needed to do something on this topic. My first supervisor, Jose, told me there was a Master’s student, Alex, who had just defended his thesis on using gravity laws to explain commuting in big cities. The work was awesome, but we needed to go deeper to make it publishable. He asked me if I was interested. I read Alex’s thesis and I was in. It was not what I was expecting to work on, but I was sure I was going to learn a lot from it. It goes without saying that the first approach was tough: days and evenings spent on the calculations to check whether the theoretical framework was sound, and that was just the first step. The data analysis was exciting, and I can still say it’s the part I like the most about my work, even more than writing the paper or making the plots.

However, this yields lots of days working hard without finding a solution or making any progress. The biggest obstacle was the surface and volume integrals and the integration of the vector field in 2D on the grid. In programming exams at the Physics Faculty they teach you a few examples of how to integrate a given curve with trapezoids or approximated squares. You sum up the areas and that’s it. But here the problem was different: I had to compute the flux of the field through a circle (or square) of different sizes around a point. And most of all, I had to discretize operators such as the curl and the divergence, and integrate a vector field on a grid. Who the hell did this before? “Come on, it’s 2017, a lot of people must have done this before”, I thought. I was wrong. At the beginning of my PhD there was a PostDoc in the Lab, Riccardo, who told me that in his PhD he learned two things: “keep it simple” and “if you have two ways to do something, try both”. I had none, or many, I didn’t know, but still I needed to keep it simple. The scalar product between the vector field and the surface normal vector was the easy thing; the problem was the parametrization. If you want to get the surface integral of a field you need to parametrize your circle. Once you fix the radius R, the integration variable is the angle, but how many steps dθ should you take? 100? 200? 1000? In the analytical approach, like the flux of a field generated by a point charge, this issue gets fixed pretty easily when you realize you can just try to reproduce the results of a few known integrals. By doing this you check how many steps are enough to get good accuracy. In the empirical approach you only have vectors placed on a grid and a circle. What do you take as the infinitesimal angle dθ? You cannot say “I take 200 steps” when you only have 50 vectors crossing the surface to integrate.
The size of a cell could be the easiest choice, but it’s not always a good approximation, because the dθ may vary depending on how the circle crosses the cell. So basically I decided to be more precise but not to die in the process: I drew the circles I needed, I counted the number of cells N crossed by the circle, and that did the job. This gives an average dθ = 2π/N for each radius, where N is the number of cells you have to go through to sum up the total flux for the given radius R.
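The counting trick can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the code used for the paper: the function names, the dense sampling used to find the crossed cells, the layout of the array F (one vector per cell, indexed as F[i, j]) and the choice of the radial direction through each cell center as the normal are all hypothetical.

```python
import numpy as np

def cells_crossed(cx, cy, R, h, n_samples=10000):
    """Cells touched by the circle, found by sampling it densely (assumed approach)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_samples, endpoint=False)
    i = np.floor((cx + R * np.cos(theta)) / h).astype(int)
    j = np.floor((cy + R * np.sin(theta)) / h).astype(int)
    # Each cell is counted once, however many samples fall inside it.
    return np.unique(np.stack([i, j], axis=1), axis=0)

def flux_through_circle(F, cx, cy, R, h):
    """Flux of F through the circle, using an average dtheta = 2*pi/N.

    F has shape (nx, ny, 2); cell (i, j) has side h and center ((i+0.5)h, (j+0.5)h).
    Assumes R is well above h, so no crossed cell is centered on (cx, cy).
    """
    cells = cells_crossed(cx, cy, R, h)
    N = len(cells)                      # number of cells crossed by the circle
    dtheta = 2.0 * np.pi / N            # average angular step for this radius
    centers = (cells + 0.5) * h
    normals = centers - np.array([cx, cy])
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    vectors = F[cells[:, 0], cells[:, 1]]
    # Sum of (F . n) * R * dtheta over the crossed cells.
    return np.sum(np.sum(vectors * normals, axis=1)) * R * dtheta

# Known integral used as a check: a 2D point source has unit flux through any circle.
h = 1.0
idx = (np.arange(50) + 0.5) * h
X, Y = np.meshgrid(idx, idx, indexing="ij")
dx, dy = X - 25.0, Y - 25.0
F = np.stack([dx, dy], axis=-1) / (2.0 * np.pi * (dx**2 + dy**2))[..., None]
flux = flux_through_circle(F, 25.0, 25.0, 20.0, h)
print(flux)  # should be close to 1
```

The point source at the end plays the role of the "known integral": if the flux stays close to 1 for several radii, the average dθ is doing its job.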

The other side of Gauss’s theorem is the volume integral. On this side the parametrization was easy: you only have to sum up the areas of the cells enclosed by the circle (or square). But what about the divergence? Discretizing the operators was basically the hardest part of the job. This was the same for the calculation of the curl and of course for the integration of the empirical and modeled vector fields. I began to search everywhere on the internet for how to compute the divergence, the curl and the potential of a vector field on a 2D lattice. I have to say there were not many results. Among these, the understandable ones were few. But at some point I bumped into Hyman, J. M. & Shashkov, M. Natural discretizations for the divergence, gradient, and curl on logically rectangular grids, Ref. 51 of the paper. This saved my life. When I saw Gauss’s theorem emerging from Figure 2 I felt safe. The curl came out right after this, as shown in Figure 3, and this was a clear sign. Half of the job was done. But it was not the end. The field was well behaved and irrotational, so we were sure we could get a potential out of it. But who did this before? Lots of people, ok. So where is the algorithm to do this? I didn’t want to spend another three months trying to find a solution. We had a theoretical framework, but we didn’t have an algorithm to get the potential out of our data. Also, the theoretical framework was meant for a continuous space, while we were dealing with a discretized one. It didn’t make much sense to compare two different things, and we were trying to do exactly this. The model and the data had to be integrated using the same discrete method. The method I was using was pretty stupid, but it worked. When you deal with a constant field and you want to get the potential, the integration reduces to multiplying the vector components of the field by the distance.
The potential is defined up to an additive constant, so you can choose a zero potential point wherever you want and start counting from there. I assumed the field was zero on the borders of the grid and constant inside each cell because I didn’t have more detailed information on it, so who cares? I had two dimensions, so I had to do the same operation on both axes, possibly at the same time to avoid bad surprises. Parametrize the grid, start from a corner and keep summing up through the whole grid. Still, it was not enough. Data is data, and since it’s data, the empirical calculation of anything cannot be perfect. There was noise. Without noise we could just build a system of differential equations and solve it. (I did try, and I blew the memory of the lab machine.) Of course it could not work. Basically we were getting the potential, clearly peaked in the center of the cities, but there was a kind of tail. It was like a shadow produced by a light placed at the corner where I chose to start the recursive integration. Behind the peak, in the shadow, we were not seeing the correct value of the potential. I showed it to my supervisors. The analogy was pretty accurate. So accurate that we decided to solve it in the same way: put lamps on the other three corners as well. I reparametrized the grid three more times, and I started the integration from the other three corners. So now we had the lights of a football field, and if you average the results you get a consistent image of the empirical potential. That was it.
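Going back to the discrete operators for a moment, here is a rough Python sketch of what the divergence, the curl and the "volume" side of Gauss’s theorem look like on a regular grid. To be clear about the assumptions: this is not the Hyman and Shashkov scheme, just plain finite differences via NumPy’s np.gradient as a stand-in, and the function names and the array layout (Fx[i, j] and Fy[i, j] at the center of cell (i, j), axis 0 along x) are made up for illustration.

```python
import numpy as np

def divergence(Fx, Fy, h):
    """dFx/dx + dFy/dy with central differences (one-sided at the borders)."""
    return np.gradient(Fx, h, axis=0) + np.gradient(Fy, h, axis=1)

def curl_z(Fx, Fy, h):
    """z-component of the curl, dFy/dx - dFx/dy, with the same differences."""
    return np.gradient(Fy, h, axis=0) - np.gradient(Fx, h, axis=1)

def divergence_integral(div, cx, cy, R, h):
    """Sum div * (cell area) over the cells whose center lies inside the circle."""
    nx, ny = div.shape
    x = (np.arange(nx) + 0.5) * h
    y = (np.arange(ny) + 0.5) * h
    X, Y = np.meshgrid(x, y, indexing="ij")
    inside = (X - cx) ** 2 + (Y - cy) ** 2 <= R ** 2
    return np.sum(div[inside]) * h ** 2

# Toy check with F = (x, y): divergence 2 everywhere, curl 0 everywhere.
h = 1.0
idx = (np.arange(50) + 0.5) * h
X, Y = np.meshgrid(idx, idx, indexing="ij")
div = divergence(X, Y, h)
rot = curl_z(X, Y, h)
# Should be close to 2 * pi * R**2, the flux of (x, y) through the circle.
total = divergence_integral(div, 25.0, 25.0, 20.0, h)
```

Comparing the last number with the flux through the same circle is the discrete version of the Gauss check: the two sides should agree up to the error made by approximating the disk with whole cells.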
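The four-lamps trick can be sketched roughly like this in Python. It is a reconstruction under assumptions, not the code behind the paper: the function names are hypothetical, the sign convention F = -grad V is assumed, both axes are combined by averaging the two one-step estimates, and each of the four results is shifted to zero mean before averaging so that the different zero-potential corners do not matter.

```python
import numpy as np

def integrate_from_origin(Fx, Fy, h):
    """Recursive integration starting from corner (0, 0), where V = 0.

    The field is taken as constant inside each cell, so moving between two
    neighboring cell centers costs half a cell of each field value.
    Assumed convention: F = -grad V.
    """
    nx, ny = Fx.shape
    V = np.zeros((nx, ny))
    for i in range(nx):
        for j in range(ny):
            estimates = []
            if i > 0:   # step along x from the already-computed neighbor
                estimates.append(V[i - 1, j] - 0.5 * (Fx[i - 1, j] + Fx[i, j]) * h)
            if j > 0:   # step along y from the already-computed neighbor
                estimates.append(V[i, j - 1] - 0.5 * (Fy[i, j - 1] + Fy[i, j]) * h)
            if estimates:
                V[i, j] = np.mean(estimates)   # use both axes at the same time
    return V

def potential_four_corners(Fx, Fy, h):
    """Average of the potentials integrated from each of the four corners."""
    results = []
    for flip_x in (False, True):
        for flip_y in (False, True):
            fx, fy = Fx, Fy
            if flip_x:   # mirror the grid along x; the x-component changes sign
                fx, fy = -fx[::-1, :], fy[::-1, :]
            if flip_y:   # mirror the grid along y; the y-component changes sign
                fx, fy = fx[:, ::-1], -fy[:, ::-1]
            V = integrate_from_origin(fx, fy, h)
            if flip_x:   # undo the mirroring so all results share one frame
                V = V[::-1, :]
            if flip_y:
                V = V[:, ::-1]
            results.append(V - V.mean())   # remove the arbitrary constant
    return np.mean(results, axis=0)

# Toy check on a noiseless field derived from V = x^2 + x*y + y^2.
h = 1.0
idx = (np.arange(20) + 0.5) * h
X, Y = np.meshgrid(idx, idx, indexing="ij")
V_true = X**2 + X * Y + Y**2
V_rec = potential_four_corners(-(2 * X + Y), -(X + 2 * Y), h)
```

On a clean field like this toy one, all four corners give the same answer and the averaging changes nothing. The averaging only earns its keep with noisy data, where each starting corner casts its own shadow.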

All this meant months of work without progress. This is the most stressful part of my work: basically I spend 80% of my working days without making progress, and this makes me wonder what I am doing with my life. I guess it is the same for all PhD students and the majority of researchers. Apart from this, my PhD program started filling up with other projects. Some of them have been uploaded to arXiv, others are still stuck in the darkness. Two years after the first day of work, the paper came out, and it made sense of all those days I went home frustrated.

2019 - Field theory for recurrent mobility

I still hope nobody asks tricky technical questions when I give talks on this at conferences.
