When your research gets really computational, head for W&M's giant abacus
Nathaniel Throckmorton was ruminating on the zero lower bound and had reached a point at which he needed William & Mary’s giant abacus.
The zero lower bound, or ZLB, is the situation in which short-term interest rates set by the U.S. Federal Reserve or another central bank hit, or come close to, rock bottom: zero. Throckmorton and colleagues at the Federal Reserve Bank of Dallas were working with a ZLB model, and the model came with a conundrum.
“There are two approaches in the literature right now to deal with this particular model,” he explained. “It’s a single model, but there are two ways of connecting that model to data.”
Throckmorton is an assistant professor in William & Mary’s Department of Economics. He said there is a clear distinction between the two ways of making the model-data connection. The first is quick and cheap, in terms of computation.
“But the trade-off is that when you have a cheap way of doing something, it turns out that you often lose the quality that you find in a more thorough, more accurate, more costly procedure,” he explained.
“Our big question is this: Is the cheaper method worth it? In other words, does the cheaper method miss anything important about the macroeconomic effects of central bank interest rate policy regarding the ZLB?” Throckmorton said. “I came to Eric. He has the giant abacus.”
Eric is Eric Walter, manager of the High Performance Computing group in William & Mary’s Office of Information Technology. Walter is the proprietor of the university’s array of supercomputers and computer clusters that do much of the heavy lifting when it comes to big-data research problems.
Today’s HPC cluster contains nearly 10,000 processor cores and is housed in the new wing of the Integrated Science Center. Another cluster, Chesapeake, is located at William & Mary’s Virginia Institute of Marine Science campus at Gloucester Point. The clusters are interoperable, and have a theoretical peak performance of 360 teraflops — 360 trillion floating point operations per second. When Walter talks to laypeople about computing power, he likes to use the laptop as a unit of measurement. The big abacus, he says, has the computational might of more than 10,000 laptops.
Walter’s HPC clients are more often materials scientists, physicists and researchers from the Virginia Institute of Marine Science modeling sea-level rise and other climate-change phenomena. But, he says, economists are welcome, as is anyone who needs cluster time.
Walter said users of the HPC fall into three categories: “There are people who know what they want to do, know how to use the stuff and just say get out of my way.
“Then there are people who know what they want to do and how to do it, but not here,” he continued. “And then there’s the more inexperienced users, who come in and say I don’t know how to do this. Then we’ll explain. And they’ll say, well, I still don’t know how to do it.”
Walter explained that the HPC group serves all three kinds of users through an active ticket system for questions, along with regular one-on-one appointments and presentations on the basics of using the cluster.
Throckmorton, Walter said, is a member of the second group. Throckmorton had extensive experience coding his work in MATLAB, but as he was adding more and more lines of code for a paper on the ZLB, he sought out Walter.
“In my field — macroeconomics — it's common for people to use MATLAB. It's also quite common for engineers to use it,” Throckmorton said. “It's a scripted language, so you're going through it line by line and doing things in sequence. It's not compiled.”
Sequential, or serial, processing is like a freight train: One car follows another, clickety-clack, clickety-clack all down the line. It’s not a problem for a train of moderate length, but the train can only run so fast, and Throckmorton was up in the cab of the ZLB Special, which was marked “express.” It was a long, long train, and he didn’t need the silicon equivalent of clickety-clacks.
“For this paper, I would be waiting weeks to get my results if I was using my MATLAB code as it was,” he said.
The solution was to convert Throckmorton’s work so it could run in parallel. If a railroad could parallelize, it could lay multiple sets of tracks and divide up the cars so that many much-shorter strings of cars could run simultaneously, greatly reducing the running time.
Working with Walter, Throckmorton was able to parallelize his code, converting it to a full FORTRAN implementation. Walter stressed that there is nothing inherently wrong with MATLAB, but languages such as C or FORTRAN lend themselves more readily to parallelization.
“When you parallelize code, you have to rethink the algorithm,” he said. “Instead of a line-by-line, step-by-step algorithm, you find ways to split it up so that multiple steps can happen simultaneously.”
Throckmorton had enough coding chops to do most of the parallelization conversion himself, once Walter got him pointed in the right direction: “I remember Eric being very helpful with the parallelization part,” he said. “I knew conceptually how to parallelize my code, but for the actual implementation, Eric gave me quite a few examples.”
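The details of Throckmorton’s code are his own, but the general idea can be sketched in a few lines. The example below is a hypothetical illustration in Fortran with OpenMP, assuming a model in which each point on a grid of economic states can be solved independently of the others; the !$omp parallel do directive tells the compiler to split the loop’s iterations across all available cores.

```fortran
! Minimal sketch of loop-level parallelization with OpenMP.
! Hypothetical example, not Throckmorton's actual model code.
! Compile with: gfortran -fopenmp parallel_grid.f90
program parallel_grid
  use omp_lib
  implicit none
  integer, parameter :: n = 1000000
  real(8) :: grid(n), solution(n)
  integer :: i

  ! Build a grid of state values for the stand-in model
  do i = 1, n
     grid(i) = dble(i) / dble(n)
  end do

  ! Each iteration is independent, so the work can be split across cores
  !$omp parallel do
  do i = 1, n
     solution(i) = solve_point(grid(i))
  end do
  !$omp end parallel do

  print *, 'Solved ', n, ' grid points on ', omp_get_max_threads(), ' threads'

contains

  ! Stand-in for the expensive calculation done at each point of a real model
  function solve_point(x) result(y)
    real(8), intent(in) :: x
    real(8) :: y
    y = exp(-x) * sin(10.0d0 * x)
  end function solve_point

end program parallel_grid
```

Serial code would march through the million points one at a time; with the loop split across, say, 100 cores of the cluster, each core handles only a sliver of the work.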
Not all HPC users are as code-savvy, and Walter’s team lends assistance when needed. Walter said one HPC staffer, Jay Kanukurthy, had been working extensively with another faculty member, rewriting a serial-coded project that would run for perhaps 85 days.
“So Jay is basically chopping up the code into little chunks and getting it to work. We can blast 100 cores with it, and it will make progress much quicker,” Walter said.
When Throckmorton talks about a weeks-long wait for a serial-coded job to run and Walter mentions another job that would process for 85 days, they’re not exaggerating, or at least not by much.
“And we just can’t do that here,” Walter said.
One reason they just can’t do that is that all jobs are not created equal. There are priority clients and priority jobs. There are jobs that need to run straight through, and there are jobs that don’t. An 85-day run is sure to cause traffic jams. Throckmorton’s nicely parallelized work hasn’t caused any problems.
“Nate's jobs are easily restartable from wherever they stop,” Walter said. “A lot of people's jobs are, but some people's aren't.”
Throckmorton’s ZLB job was a pretty heavy user of the HPC clusters. He was using so much computation time that the economics department and Janice Zeman, dean of undergraduate studies, chipped in to buy five additional nodes for the HPC.
Walter pulled the numbers, noting that Throckmorton used 3,701,948 core-hours of the HPC general resources. “General resources” refers to all computing assets except the clusters reserved for high-energy physics jobs. That comes out to 2% of the general HPC resources in 2016, 17% in 2017, 21% in 2018 and 0.3% so far in 2019.
“This is more than 6.5 million core-hours since 2016. A core-hour is one processor core running for one hour,” Walter explained. “Therefore, a four-core laptop, assuming the same core speed, could do these calculations in 1.6 million hours — around 185 years.”
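The conversion is simple arithmetic; here is the same back-of-the-envelope calculation written out, using the figures Walter quoted.

```fortran
! Back-of-the-envelope check of Walter's figures.
program core_hours_check
  implicit none
  real(8), parameter :: core_hours   = 6.5d6   ! core-hours used since 2016
  real(8), parameter :: laptop_cores = 4.0d0   ! a hypothetical four-core laptop
  real(8) :: laptop_hours, years

  laptop_hours = core_hours / laptop_cores      ! about 1.6 million hours
  years = laptop_hours / (24.0d0 * 365.0d0)     ! about 185 years

  print '(a,f12.0,a,f6.1,a)', 'Laptop hours:', laptop_hours, '  (about', years, ' years)'
end program core_hours_check
```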
Macroeconomics papers such as Throckmorton’s are computation-heavy and equation-filled. He makes a wry, rueful expression when someone points out that economics is not usually listed among the STEM disciplines.
“Some economics departments across the country are classified within STEM,” he said. “It's a classification that has to be approved at the state level. In Virginia, none of the state's schools have an economics department classified as STEM.”
Basic economic principles involve simple math, he said, citing the example of a simple supply and demand model. A common graphing calculator is all that’s needed to determine the intersection where supply equals demand.
“Now, imagine a more complicated model with maybe three related markets, each with a separate supply and demand curve, and you're trying to find that equilibrium point,” Throckmorton continued. “You can imagine trying to figure that out with a graphing calculator.”
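To see why, consider a hypothetical version of that three-market problem with linear supply and demand curves, where each market’s demand also shifts with the other markets’ prices. Setting supply equal to demand in all three markets at once gives a small linear system; the sketch below solves it with Gaussian elimination, using made-up coefficients purely for illustration.

```fortran
! Hypothetical three-market equilibrium with made-up coefficients.
! Each row says supply = demand in one market: the diagonal combines the
! own-price slopes of supply and demand, the off-diagonals are cross-price
! effects, and b holds demand intercepts minus supply intercepts.
program three_market_equilibrium
  implicit none
  real(8) :: A(3,3), b(3), p(3), factor
  integer :: i, k

  A = reshape([ 2.0d0, -0.3d0, -0.1d0,  &
               -0.2d0,  1.8d0, -0.4d0,  &
               -0.1d0, -0.2d0,  2.5d0], [3,3])
  b = [10.0d0, 8.0d0, 12.0d0]

  ! Forward elimination
  do k = 1, 2
     do i = k+1, 3
        factor = A(i,k) / A(k,k)
        A(i,:) = A(i,:) - factor * A(k,:)
        b(i)   = b(i)   - factor * b(k)
     end do
  end do

  ! Back substitution to recover the equilibrium prices
  do i = 3, 1, -1
     p(i) = (b(i) - sum(A(i,i+1:3) * p(i+1:3))) / A(i,i)
  end do

  print '(a,3f8.3)', 'Equilibrium prices: ', p
end program three_market_equilibrium
```

With three markets the system is still manageable by hand; realistic macroeconomic models involve far more equations and nonlinearities, which is where the cluster earns its keep.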
He said his department and economists at other Virginia universities are discussing how to get their discipline classified as STEM by the State Council of Higher Education for Virginia (SCHEV). STEM classification makes a difference on a number of practical points.
Throckmorton said one of his students graduated last year and got a job as a research assistant at the Kansas City Federal Reserve.
“He's an international student,” Throckmorton explained. “He also majored in math, so he can work in the United States for three years after college because he has a STEM degree.”