Saturday, December 8, 2012

My Personal Open-Source Story

Most people who know me also know that I am a huge proponent of open-source software and open science in general. I have many good reasons for this, and I may blog about them in a future post. What most may not know, though, is my very personal connection to open-source research codes: if it weren't for open source, my career would probably have taken a very different path and, quite likely, I would not be at Caltech, working with one of the largest and most productive numerical relativity/relativistic astrophysics groups in the world.

Here goes the story:

It all begins in the fall of 2001. I had just arrived at the University of Arizona as an exchange grad student (I was actually in my fourth year of undergrad at Heidelberg, but that put me at the level of a first-year grad student at Arizona back then). Back in Heidelberg, I had just taken GR and also participated in a seminar on theoretical astrophysics, to which I contributed a review of millisecond pulsars. This got me really interested in neutron stars and their formation. While at UofA, I wanted to do some research on this with one of the professors participating in UofA's Theoretical Astrophysics program.

So I walked into Adam Burrows's office (Adam moved to Princeton in 2008). Adam had just received a big Department of Energy SciDAC grant to develop the next generation of multi-dimensional supernova simulations, and he had three grad students, Todd Thompson, Casey Meakin, and Jeremiah Murphy, already working with him. I became his fourth supernova student, and my task was to look into GR aspects of the supernova problem and, in particular, into gravitational wave emission. Adam wanted to do 3D hydro simulations of collapsing stars, but we did not have a code to do this. So one of the first things Casey, Jeremiah, and I were tasked with was to explore publicly available 3D hydro codes that could be customized for modeling core collapse. Of course, we had almost zero experience in numerical modeling. We were basically just messing around. We looked into ZEUS-MP (but couldn't get it to work) and FLASH (but couldn't get it to work either; though Sean Couch did, about ten years later...). Through intense use of Google (and, perhaps, even AltaVista), I eventually came across a web page sporting the following advertisement for the code GR3D:

GR3D -- now running at over 140 GFlops/s
Logo on the (now offline) Washington University web page offering GR3D for download.
GR3D was the first open-source code for 3D GR hydrodynamics and spacetime evolution, developed by a collaboration of scientists from the Washington University gravity group, NCSA/the Albert Einstein Institute (AEI), ANL, LLNL, and Stony Brook University. GR3D was built on top of an early version of the Cactus Computational Toolkit, developed by Ed Seidel's group first at NCSA and later at the AEI. Cactus was one of the ultimate outcomes of the mid-90s NSF-funded Binary Black Hole Grand Challenge Alliance project, which was supposed to solve the problem of merging two black holes in general relativity (but failed). In a follow-up project, the WashU/NCSA/AEI/ANL/LLNL/Stony Brook team was funded as part of a NASA High-Performance Computing Grand Challenge project to develop a multi-purpose 3D GR hydrodynamics code capable of simulating neutron star mergers. The result of this project was GR3D (see here for the final report preserved at the Internet Archive).

The website hosting GR3D has been down for a while, but thanks to the Internet Archive, one can still download the original tarball (or it might have been this one: GR3D_EOS.tar) that I downloaded in late 2001. Even the documentation (a PostScript file) is available.

I downloaded GR3D and managed to get it compiled and running on the Cray T3E and the (then considered huge) IBM SP3 "Seaborg" at NERSC. GR3D's documentation was helpful for getting simple test problems running, but there was no documentation for more advanced things such as merging two neutron stars, let alone core collapse. That did not scare me. It was clear to me that the code I had my hands on could basically do what I wanted to do; I just had to teach it how. After a few weeks of fiddling with it, hacking its input data routines, and learning how to set up semi-consistent initial spacetime data for a collapsing star, I indeed managed to collapse a star -- running on 512 of Seaborg's processors and using a 384x384x384 3D Cartesian grid. Here is a movie that I produced using data from 2D density slices through the equatorial plane.


Well... the protoneutron star and the shock wave formed at bounce turned out to be cubical rather than spherical, but that's no wonder, given the ridiculously coarse resolution (computational cells ~15 km on a side...) I was able to use at the time. It became clear to me pretty quickly that I would either need to switch to spherical coordinates (to get higher radial resolution at constant angular resolution) or somehow get adaptive mesh refinement (AMR) into GR3D.

By now (early 2002), I had understood how GR3D and Cactus fit together. GR3D was based on Cactus version 3, but Ed Seidel's team at the AEI was already working on Cactus 4. I started e-mailing the Cactus developers (Gabrielle Allen and Thomas Radke were my first contacts) and signed up to the Cactus users' mailing list on January 24, 2002. They were quite impressed by what I had been able to do on my own with an old version of their code and got me in touch with Ed Seidel. Ed, who had worked on core collapse in 1D for his thesis, immediately offered to have me collaborate with his team and/or to make tools available to me. Here is an excerpt from one of the e-mails we exchanged (my lines have a > prepended):
>As I currently cannot proceed with my work on gravitational collapse due
>to the mentioned limitations of Cactus/MAHC, I would be glad to be able to
>make my research project available as a beta test site for an AMR thorn
>and possibly participate in its further development as well as
>help to implement the necessary adjustments into Cactus and MAHC. By
>the way: is MAHC the hydro code you are using?

No, we are developing a new version, based on our experiences with
MAHC and its predecessor, GR3D from the NASA code.  We are also
beginning to look into core collapse problems, and have many ideas on
adapting the codes for this purpose.

I would like to discuss them with you, in case you would like either
to help out, or to be involved in some way, or simply to know what we
are doing so we could make things available to you for your work.
Also, I think you'll find the Cactus group quite eager to help you if
you find things needed for your research that are not present in
Cactus.

The new code development mentioned by Ed happened in the context of a European Union Research Training Network (http://www.eu-network.org/) and involved researchers at the AEI, Valencia, MPA Garching, and SISSA (Trieste, Italy).

Ed invited me to visit the AEI in May 2002, where I first met Ian Hawke. Ian was the lead author of the new GR hydro code Whisky, which was based on the MAHC code that came with GR3D and formed the foundation of the GRHydro code that is now part of the Einstein Toolkit. Ed also brought in Harry Dimmelmeier, who had just completed his PhD thesis on rotating core collapse in 2D GR. This was the start of the Cactus Core Collapse Collaboration project, which, fast forward a few years, would turn into my PhD thesis.

In summer 2002, I returned to Heidelberg for my final year, in which I worked on my Diploma thesis on 2D Newtonian core collapse (using, for the first time, a complex equation of state). Ian and Harry came to visit me in early 2003, and we got together at the AEI again in the late spring. This time we were joined by Erik Schnetter, who was developing the first version of Carpet, the now widely-used open-source AMR driver for Cactus. Erik was enthusiastic about helping us make Carpet work for the core collapse problem, and he has been one of my closest collaborators ever since.

When I was looking for places to do my PhD, the AEI was at the top of my list. Ed, who had just accepted the directorship of the Center for Computation and Technology (CCT) at Louisiana State University, let me choose: I could either come with him to LSU or work with his former group at the AEI while collaborating with him and his new group at CCT/LSU. I went for the AEI option -- primarily because I really, really wanted to live in Berlin for a few years and, secondarily, because Ian Hawke was there and Erik Schnetter was starting at the AEI as a postdoc at the same time I would arrive as a PhD student. I spent ~3.5 years at the AEI, during which I was trained primarily by Erik Schnetter and Ian Hawke under the hands-off mentorship of Ed Seidel and Bernard Schutz. Our work resulted in the first full 3D GR simulation of stellar collapse to a protoneutron star, Ott et al., PRL 98, 261101 (2007), and many subsequent papers. All the code developed as part of my PhD research on 3D GR stellar collapse is now part of the Einstein Toolkit, allowing others to reproduce our results and, importantly, allowing aspiring and industrious young students to approach new problems without having to re-invent the very basics.

The very fact that GR3D was publicly available jump-started my career. I got to know and have been able to work with amazing people, visionary computational scientists, who are deeply committed to advancing science with open source tools.

The moral of this story is: make your code open-source, even if its documentation sucks or is non-existent. Some smart kid will figure it out and push the frontiers of science.

Saturday, November 3, 2012

General Relativistic Simulations of Three-Dimensional Core-Collapse Supernovae (arXiv:1210.6674) -- Meet the Team!

A bit more than a week ago, we uploaded our first full 3D core-collapse supernova paper to the arXiv (arXiv:1210.6674). Making the simulation code work, getting the simulations done, analyzing and understanding them, making the (I think breathtaking) graphics, and writing everything up was a huge team effort. Ultimately, the paper will be referred to as Ott et al. (2012), which, I find, does not give enough credit to all the other team members whose contributions were crucial. In this blog post, I want to introduce and give due credit to my team.

The author list: Christian Ott, Ernazar Abdikamalov, Philipp Mösta, Roland Haas, Steve Drasco, Evan O'Connor, Christian Reisswig, Casey Meakin, and Erik Schnetter.
(Important aside: You have probably noticed that there is not a single woman on the author list. The extreme underrepresentation of women in theoretical and computational astrophysics is a big problem that is hurting our field.)

My main contributions were: Defining the scope of the project, doing the actual simulations (which were carried out on supercomputers at NERSC, NICS, and TACC), overseeing the analysis and visualization, and writing the paper.

Ernazar Abdikamalov, TAPIR, Caltech
Ernazar Abdikamalov -- Ernazar was the analysis czar for this paper, and he also wrote the first draft of the paper's section on explosion criteria. He wrote (together with Roland Haas) the analysis package and carried out the time-consuming postprocessing of our 3D simulation data. For each of our simulations, we kept about 5-8 terabytes of output (in HDF5 format), and digging out the information needed for the paper meant reading in time slices of 3D output and running the analysis code on them. One full postprocessing run took about a day, but Ernazar spent weeks re-doing the postprocessing, since we kept adding things we wanted to look at as the paper progressed.
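To give a flavor of what such a postprocessing loop looks like, here is a minimal Python sketch using h5py. The file name, dataset layout, and analysis function are hypothetical stand-ins; real Cactus/Carpet HDF5 output is considerably more complex (datasets split across refinement levels, components, and iterations).

```python
# Minimal postprocessing sketch (hypothetical file/dataset names; real
# Carpet HDF5 output is split across refinement levels and components).
import h5py
import numpy as np

def total_rest_mass(rho, dx):
    """Toy analysis: integrate density over a uniform grid patch."""
    return np.sum(rho) * dx**3

with h5py.File("output/rho.h5", "r") as f:
    for name in sorted(f.keys()):    # one dataset per stored time slice
        rho = np.array(f[name])      # read one 3D density array
        dx = 15.0                    # assumed uniform grid spacing
        print(name, total_rest_mass(rho, dx))
```

Multiply a loop like this by terabytes of data and dozens of analysis quantities, and a full day per postprocessing run is easy to believe.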

Some background on Ernazar: I first met Ernazar (who originally comes from Uzbekistan) when he was a PhD student working with Luciano Rezzolla at SISSA in Trieste. I ended up collaborating with Ernazar on the final paper of his PhD thesis (on general-relativistic simulations of accretion-induced collapse) in 2009. He then went to work as a postdoc with my friends in the numerical relativity group at LSU's Center for Computation and Technology. Knowing the outstanding work he does, I brought him to Caltech first as a long-term visitor, then hired him as a postdoc once I had the money to do so. His current primary research interest and specialty is radiation transport, and we have a few papers together on this and other subjects.

Philipp Mösta, TAPIR, Caltech
Philipp Mösta -- Philipp was the plot czar for this paper. He worked very closely with me and Ernazar in making sense of the simulation results and presenting them in the graphs and 2D color maps shown in the paper. These plots were made with the open-source, Python-based matplotlib package. After the hundreds of revisions he made to the plots, Philipp is a true expert and knows how to make excellent plots (a skill that must be acquired by hard work). Philipp also participated closely in the discussions, led by Ernazar and me, of what our results actually mean and how to present them in the paper.
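For illustration, here is a minimal matplotlib sketch of the kind of 2D color map I am talking about. The data below is a toy stand-in (a Gaussian blob), not actual simulation output:

```python
# Minimal 2D color-map sketch with matplotlib (toy data, not real output).
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for an equatorial density slice from a simulation.
x = np.linspace(-150.0, 150.0, 384)            # km
X, Y = np.meshgrid(x, x)
rho = np.exp(-(X**2 + Y**2) / (2 * 40.0**2))   # toy Gaussian "density"

fig, ax = plt.subplots(figsize=(5, 4))
im = ax.pcolormesh(X, Y, rho, shading="auto")
ax.set_xlabel("x [km]")
ax.set_ylabel("y [km]")
ax.set_aspect("equal")
fig.colorbar(im, ax=ax, label=r"$\rho$ (arbitrary units)")
fig.savefig("slice.png", dpi=150)
```

Getting from a quick-look plot like this to publication quality (sensible color scales, consistent fonts, readable labels at journal column width) is exactly where the hundreds of revisions go.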

Some background on Philipp: Philipp is a recent PhD (early 2012) coming out of Luciano Rezzolla's group at the Albert Einstein Institute (the Max Planck Institute for Gravitational Physics in Potsdam, Germany). I have known him since 2006, when he was at the AEI working on a numerical relativity code for his master's thesis. His PhD research was on electromagnetic fields and their interaction with merging binary black hole systems. He arrived at Caltech in November 2011 and is a postdoc in my group. His main interest right now is magnetohydrodynamic processes in core-collapse supernovae, but he continues to work on more fundamental numerical relativity problems and also helps with double neutron star merger simulations using the SpEC code of the Simulating eXtreme Spacetimes collaboration.

Roland Haas, TAPIR, Caltech
Roland Haas -- Roland was the key instrumentalist for this paper. Simulation codes are research instruments, in the same sense as instruments in experimental research. Instrumentalists in experimental physics (for example, the instrumentalists who make LIGO work) always get credit for their work, and it is time to extend this to computational research. Let's face it: without people like Roland, who know the ins and outs of the complex 3D codes that are now being used by us and other groups to model cosmic explosions (and many other things), our work would just not be possible. Our core collapse simulation package Zelmani is based on the open-source Einstein Toolkit, which in turn uses the Cactus Computational Toolkit (as its base infrastructure) and the Carpet adaptive mesh refinement driver. Roland is one of the key developers of all these codes (i.e., scientific instruments) and knows them by heart. His contribution was crucial for getting the code to work and ironing out infrastructure-related problems, in particular with the new multi-block grid that our simulations used for the first time. Roland also made important contributions to Ernazar's postprocessing work. They are best summarized by the little running joke we have in our group: "How does Ernazar program Cactus?" Answer: "He walks into Roland's office."

Some background on Roland: Roland received his PhD from the University of Guelph, where he worked with Eric Poisson on self-force calculations in general relativity. He then worked in Pablo Laguna's group at Georgia Tech, where he became involved in the Einstein Toolkit. In 2010, he won a Canadian NSERC postdoctoral fellowship, which he took to Caltech in the fall of 2011. His main current research is focused on the double neutron star merger problem, which he is tackling with the SpEC code. He has also been working closely with Christian Reisswig on the new multi-block extension to Zelmani.

Steve Drasco, Grinnell College
Steve Drasco -- Steve did all the 3D rendering and generated the 2D entropy colormaps shown in the paper. He is also the guy responsible for the movie of one of our simulations that's available on the Simulating eXtreme Spacetimes YouTube channel. This was extremely time-consuming, but also extremely important work, which he carried out with the open-source VisIt visualization package. Steve not only rendered the final simulation output, but was also frequently called on to help diagnose potential (and real) problems as the simulations were progressing. Thanks to him, we identified a problem that appeared when the supernova shock gets too close to AMR boundaries. Had we not looked at his plots, we would not have noticed it and would have wasted a huge amount of supercomputer time.
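Batch rendering like this is typically scripted through VisIt's Python command-line interface. The sketch below shows roughly how one loops over time slices and saves one frame per slice; the database pattern and variable name are hypothetical, and this is my own rough sketch of the CLI, not Steve's actual pipeline:

```python
# Run inside VisIt's Python CLI, e.g.: visit -cli -s render_frames.py
# Database pattern and variable name are hypothetical.
OpenDatabase("output/entropy.*.h5 database")
AddPlot("Pseudocolor", "entropy")
DrawPlots()

s = SaveWindowAttributes()
s.format = s.PNG
s.fileName = "frame"
SetSaveWindowAttributes(s)

for state in range(TimeSliderGetNStates()):  # one image per time slice
    SetTimeSliderState(state)
    SaveWindow()                             # writes frame0000.png, ...
```

The frames are then stitched into a movie with a standard encoder. The hard (and slow) part is everything this sketch hides: camera paths, transfer functions, and 3D volume rendering of terabyte-scale data.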

Some background on Steve: Steve is a relativist by training and received his PhD from Cornell, working with Eanna Flanagan. His main research focus has been on extreme and intermediate mass-ratio binary black hole inspirals. He came to Caltech/JPL as a postdoc in 2005, moved to the Albert Einstein Institute in 2008, and was a lecturer at Cal Poly San Luis Obispo from 2010 until this June. He is now an Assistant Professor of Physics at Grinnell College and has a visitor appointment in my group in TAPIR. Steve has always had a great talent for visualizing his work. Some of his stuff has recently been featured in Physics Today. Also check out his cool movies of these inspirals. Steve loves fast cars and owns two Porsches.

Dr. Evan O'Connor (right) and Christian Ott (left).
Evan O'Connor -- Evan was the microphysics expert for our paper. He put together the equation of state table used in our simulations. Evan also developed the leakage scheme originally used in his code GR1D and helped port it to 3D and integrate it into Zelmani. To set the stage for our 3D simulations, he ran a suite of spherically symmetric simulations with GR1D, which let us build some intuition for the overall collapse and postbounce dynamics of the 27-solar-mass progenitor that we used in our simulations.
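The core idea of a leakage scheme is simple to state: in each zone, one computes a local (free) neutrino emission rate and a diffusive rate set by the local optical depth, then interpolates between them so that losses are emission-limited where the matter is transparent and diffusion-limited where it is opaque. A one-function sketch of the standard interpolation (in the spirit of Ruffert et al. 1996; this is an illustration, not actual GR1D code):

```python
def effective_loss_rate(q_local, q_diff):
    """Standard leakage interpolation (sketch, not actual GR1D code).

    q_local : free emission rate (dominates where matter is optically thin)
    q_diff  : diffusive loss rate set by the local optical depth
    Returns a rate that smoothly limits to min(q_local, q_diff).
    """
    return q_local * q_diff / (q_local + q_diff)
```

This harmonic-mean form automatically picks out the slower of the two processes, which is what physically bottlenecks the energy loss.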

Some background on Evan: Evan is the first PhD that I produced, and I am as proud as a "parent" could possibly be! The picture on the right shows Evan and me on the patio of Caltech's Cahill Center for Astronomy & Astrophysics right after his defense. The hat that we made for him says "Licensed Supernova Theorist". Evan graduated this June and, since September 1, has been a postdoctoral fellow at the Canadian Institute for Theoretical Astrophysics at the University of Toronto. His main current interests are neutrino transport, neutrino microphysics, and the nuclear equation of state. Check out his publication list. It's already quite impressive!

Christian Reisswig, TAPIR, Caltech
Christian Reisswig (The Reisswig) -- Christian's role in this paper was that of an instrumentalist. He is one of the co-developers of the Llama Code, an extension to Cactus and Carpet that latches multiple grid blocks with logically Cartesian, physically curvilinear coordinates onto Carpet's AMR grids. This is a great technical advantage over previous, purely Cartesian 3D approaches, because it allows us to push the outer boundary of the computational domain out to very large radial distances while keeping the angular resolution fixed, saving a tremendous amount of memory and supercomputer time. Christian is also an expert on our Carpet AMR driver and has been working very closely with Roland Haas and Erik Schnetter on improving various aspects of it. He is also our lead optimizer: we could not have run our simulations without his improvements to the parallel performance of our code.
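A back-of-the-envelope way to see the savings (my own illustration with made-up numbers, not figures from our paper): with fixed angular resolution and radial spacing proportional to r, the number of cells per radial shell stays constant and the total cell count grows only logarithmically with the outer radius, instead of cubically as for a uniform Cartesian box.

```python
# Back-of-the-envelope cell-count comparison (made-up numbers, for
# illustration only): uniform Cartesian box vs. curvilinear shells with
# fixed angular resolution and radial spacing proportional to r.
import math

r_out, r_in, dx = 5000.0, 100.0, 1.0   # km; illustrative values

# Uniform Cartesian cube covering [-r_out, r_out]^3 at spacing dx:
n_cartesian = (2 * r_out / dx) ** 3

# Curvilinear region from r_in to r_out with dr/r = dx/r_in (so the
# radial resolution at r_in matches dx) and fixed angular cell count:
n_angular = 6 * 64 * 64                # e.g. six 64x64 angular blocks
n_radial = math.log(r_out / r_in) / math.log(1 + dx / r_in)
n_curvilinear = n_angular * n_radial

print(f"Cartesian:   {n_cartesian:.2e} cells")
print(f"multi-block: {n_curvilinear:.2e} cells")
```

With these (made-up) numbers, the uniform Cartesian box needs ~10^12 cells versus ~10^7 for the multi-block layout: the kind of gap that decides whether a simulation is feasible at all.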

Some background on Christian: I have known Christian since ~2005, when he arrived as a master's student at the Albert Einstein Institute (where I was a PhD student at the time). Already back then, he played around with an early multi-block infrastructure. In his PhD research (which he concluded in 2010), he worked on binary black hole inspiral and merger simulations and became one of the leading experts in extracting gravitational waves from such simulations. He arrived as my first postdoc at Caltech in early 2010 and was awarded an Einstein Fellowship in 2012. His main research interests right now are black hole formation, his magical next-generation Simsalabim high-performance computing infrastructure (ask him about it after he has had a few drinks!), and rapidly rotating core collapse.

Casey Meakin, LANL
Casey Meakin -- Casey contributed to our paper by working with Ernazar and me on the analysis of neutrino-driven convection in our models. He also worked out the typical magnitude of the perturbations (local deviations/fluctuations from the spherical background) one would expect to be present in the core of the 27-solar-mass presupernova star whose collapse we simulated. This was crucial, because it gave us a quantitative handle not only on the size of these physical perturbations (which will seed convection), but also on the time at which they would reach the stalled supernova shock. This helped us argue that the relatively large numerical perturbations that our simulations experience (from AMR boundaries and the Cartesian grid) are comparable to what one would expect from convective silicon core and shell burning.

Some background on Casey: I have known Casey since 2001, and he is one of my few close long-time friends. We were both graduate students at Steward Observatory at the University of Arizona (well, I was sort of an "exchange student"), working with Adam Burrows at the time. Casey subsequently worked with Dave Arnett on multi-dimensional stellar evolution and graduated in 2006. He was a postdoc at the Flash Center at the University of Chicago, then went back to Arizona for a while, and is now a Scientist in the Theoretical Division of Los Alamos National Laboratory and adjunct faculty at Steward Observatory.

Erik Schnetter, Perimeter Institute
Erik Schnetter -- It is hard to adequately summarize Erik's contribution to this paper; his work on Cactus and, in particular, on the Carpet AMR driver enables everything we do with Zelmani. Perhaps I can describe him as the senior instrumentalist on our team. His instant messenger screen name is eschnett247 -- and he has indeed been a tremendous, near-24/7 resource for infrastructure help and guidance on how to get our simulations to perform well on various supercomputers. Earlier this year, I think it was in May, when we were gearing up to start our simulations, Erik dedicated an entire weekend to helping us improve the parallel scaling of the production-level simulation setup. We reserved half of our local Caltech supercomputer (which is called Zwicky, in honor of the late Caltech astrophysicist and supernova hunter Fritz Zwicky), my crew at Caltech got together, we established a video conference with Erik, and off we went hacking, breaking, fixing, and improving Zelmani. Erik also did a tremendous amount of extremely difficult code design and development that enabled our simulations: he designed and implemented cell-centered AMR in Carpet and a mechanism called "refluxing", which makes the mass, momentum, and energy fluxes consistent at AMR boundaries so that these quantities are conserved exactly.
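To illustrate the refluxing idea in one dimension (a schematic sketch of the standard Berger-Colella flux correction, not Carpet's actual implementation): after a coarse time step, the coarse cell adjacent to a refinement boundary is corrected so that the flux through the shared face equals the time-averaged fine-grid flux, which makes the conserved quantities match exactly across the interface.

```python
# Schematic 1D flux correction ("refluxing") at a coarse/fine interface.
# This is an illustration of the general technique, not Carpet code.
import numpy as np

def reflux(u, F_coarse_face, F_fine_faces, i_cell, dt, dx):
    """Correct the coarse cell left of a refinement boundary.

    u              : coarse-grid conserved variable, updated in place
    F_coarse_face  : flux the coarse step used at the interface
    F_fine_faces   : fine-grid fluxes at that same face, one per fine
                     substep (e.g. two for 2:1 refinement in time)
    i_cell         : index of the coarse cell left of the interface
    """
    F_fine_avg = np.mean(F_fine_faces)   # time-averaged fine flux
    # The conservative update used F_coarse_face at this cell's right
    # face; replacing it by the fine average restores exact conservation:
    u[i_cell] -= dt / dx * (F_fine_avg - F_coarse_face)
    return u
```

Whatever leaves the fine grid through the interface then enters the coarse grid exactly, and nothing is created or destroyed at the refinement boundary.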

Some background on Erik: I have known Erik since 2003 and he is another very close, long-time friend. I met him when he was a postdoc at the Albert Einstein Institute and I was a graduate student there. Erik (together with Ian Hawke) basically trained me in numerical relativity and computational science. In 2005, he moved as a Research Assistant Professor and Staff Scientist to the Center for Computation and Technology at LSU and in 2010 he became Research Technologies Group lead at the Perimeter Institute in Waterloo, Ontario, Canada. He is also adjunct faculty at the University of Guelph.