What is Mirage and why should I get it?

Mirage is a data visualization tool. It has many nice features, but the most powerful is its multiple simultaneous views of your data. As a concrete example, let's say you've plotted a histogram of some quantity using your favorite tool, and you notice a few outliers. What are they? Are they spatially clustered? Are they also outliers with respect to some other quantity? To find out, you'll have to extract them from your catalog and make another plot using that subset of the data. Mirage lets you work much more efficiently. You can select a chunk of the histogram and have that selection immediately broadcast to a scatterplot. It lets you do science without worrying about the plumbing.

OK, how do I get Mirage?

Note: Mirage is written in Java, so it runs on almost any operating system. The conventions used here are Unix-style, but Windows users should be able to adapt them easily.

  • First, go to the Mirage home page and download the latest version by following the link to "Bell Labs software distribution web site", and following the steps therein. Create a directory to keep it in, say "/usr/local/mirage" (it could also be under your home directory) and save the binary to a file in that directory, say "mirage0.2.tar.gz". Then unpack it by going to that directory and running "tar xzvf mirage0.2.tar.gz" (note: version 0.2 will be available by April 14, 2003).

  • Make sure you have Java version 1.3 or later by running "java -version". If you do not, consult your local guru to find out how to upgrade.

  • To run it, use the command "java -jar /usr/local/mirage/Mirage0.2.jar" (or whatever the path is on your system). Power users may wish to add an option like "-Xmx750000000" before "-jar" to let Mirage use up to 750 MB of memory; with large datasets, more memory speeds things up. In any case, you will see the popup pictured below.
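The launch command from the last step can also be assembled in a script. This is only a minimal sketch, assuming the example jar path used above; build_mirage_command is a hypothetical helper and not part of Mirage. It places the memory option before "-jar", because the java launcher passes anything after the jar path to the application rather than to the JVM.

```python
def build_mirage_command(jar_path, max_heap_bytes=None):
    """Return the argument list for launching Mirage (hypothetical helper)."""
    cmd = ["java"]
    if max_heap_bytes is not None:
        # JVM options such as -Xmx must come before -jar.
        cmd.append(f"-Xmx{max_heap_bytes}")
    cmd += ["-jar", jar_path]
    return cmd


# Example: the 750 MB setting mentioned in the text.
command = build_mirage_command("/usr/local/mirage/Mirage0.2.jar", 750000000)
print(" ".join(command))
```

To actually start Mirage from Python, the list could be handed to subprocess.run(command).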

How do I use Mirage?

Here's an example based on the DLS data.

Getting DLS data

  • Click "Cancel" on the "Load dataset" popup. Don't worry, this does not exit the program. It simply lets us get to a more sophisticated data loading option.

  • You should now see a blank canvas with three menu options at upper left. Choose Console->New Dataset via HTTP. On the first line (Server), choose DLS from the menu at right. Click on "Get information from server" to see if all the plumbing is working. You should see this:

    (Click on low-resolution images to see them at full resolution.) Some types of firewalls may prevent the outgoing connection. If you need to use an HTTP proxy in everyday browsing, this means you.

  • Now enter a real query in the "Send query" box. Try
    SELECT ra,decl,magb,magv,magr,magz FROM photo WHERE (magb < 25)
    
    as shown in this screenshot:

  • Now click on "Get data from server". After a brief pause you should see this in the main window:

    You can now quit the HTTP Options popup.
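To see what the query above asks the server for, here is a small sketch that runs the same SELECT against an in-memory SQLite table. The table contents are made-up toy rows, not DLS data, and the real server's database engine and schema may differ; only the query string is taken from the text.

```python
import sqlite3

# Toy stand-in for the DLS "photo" table; these rows are invented for illustration.
rows = [
    (13.30, 12.40, 23.1, 22.5, 22.0, 21.4),
    (13.45, 12.70, 25.6, 24.9, 24.3, 23.8),  # fails the magb < 25 cut
    (13.60, 12.55, 24.9, 24.0, 23.5, 23.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE photo (ra REAL, decl REAL, magb REAL, magv REAL, magr REAL, magz REAL)"
)
conn.executemany("INSERT INTO photo VALUES (?, ?, ?, ?, ?, ?)", rows)

# The query from the tutorial: six columns, keeping only objects with magb < 25.
query = "SELECT ra,decl,magb,magv,magr,magz FROM photo WHERE (magb < 25)"
selected = conn.execute(query).fetchall()
print(len(selected), "of", len(rows), "toy objects pass the cut")
```

The WHERE clause drops every object with magb of 25 or more, so any plot made from the result inherits that cut.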

Using Mirage to examine the data

  • What you just saw was the table view, which is fairly boring if you have a large dataset. Using the tabs at top center, you can switch between table, histogram, scatterplot, and feature vector views. Try it.

  • But the best view is multiple simultaneous views. Click on the tab labeled "1" and you get all four at once:

    You can create more tabs as you like, with arbitrary configurations of multiple plots, by using the buttons running down the left side of the view.

  • Because RA and DECL are the first two columns of the table, the histogram defaults to a histogram of number versus RA, and the scatter plot defaults to DECL versus RA. The histogram is not very interesting, so make it refer to some more interesting quantity. In the histogram plot, pull down the menu labeled "ra" and change it to "magr". Now you see number versus R magnitude, which really tells you something about the data. (But remember, we asked for magb<25 in our SQL query, so we are not seeing the true incompleteness.)

  • Now try the broadcast feature. First highlight a subset of the data by clicking on the red rectangle icon at right, then highlighting some section of the histogram:

    Now click on the broadcast icon at lower right, and after a few seconds, the other plots will show your selection:

  • What does this tell us? First, looking at the scatterplot, you can see that this magnitude slice is uniformly distributed in RA and DECL, which is good (the holes are due to very bright stars). Second, the table view doesn't tell you much, but it might if the data subset were smaller and you wanted to browse through all the items. Third, the feature vector plot (upper right) requires some more explanation.

  • In the feature vector plot, click on the button whose tool tip says "show values of all feature vectors". Then the plot will look like this:

    This is a true feature vector plot rather than just the range of the data. Each object is plotted as a set of dots representing its ra, decl, magb, magv, magr, and magz from left to right, and the dots are connected by line segments. (The actual values are unfamiliar because the units have been standardized. Also, the dots and line segments can get so numerous as to form a solid area.)

    So the second-from-right column is R magnitude, and you can see that our selection is a very narrow range of R magnitude, simply because that's how we defined our selection in the histogram plot. What's new is that you can see that this narrow slice of R corresponds to a large range of z and V magnitudes, a not-so-large range of B, and the full range of ra and decl represented in the catalog. In addition to just the ranges, you can see by the density of points that most of this narrow slice of R corresponds to a narrow slice of V, but there are a few outliers, whereas in z there is somewhat more of a scatter.

    If this selection in R truly corresponded to a narrow range of B, this would be new astrophysics, but remember that we selected for magb<25 in our SQL query, so this is an artifact. But it illustrates how to use the feature vector plot to investigate trends in the data.
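The select-then-broadcast workflow and the feature vector plot can be imitated in a few lines. The sketch below is an illustration under stated assumptions, not Mirage's implementation: the catalog rows are made up, the helper names are hypothetical, and "standardized" is assumed here to mean rescaling each column to zero mean and unit standard deviation (Mirage's actual scaling is not specified in this page).

```python
from statistics import mean, pstdev

COLUMNS = ["ra", "decl", "magb", "magv", "magr", "magz"]

# Made-up rows for illustration only (not DLS data).
catalog = [
    (13.30, 12.40, 24.1, 23.6, 23.2, 22.1),
    (13.45, 12.70, 22.0, 21.5, 21.1, 20.6),
    (13.60, 12.55, 24.8, 23.8, 23.4, 23.3),
    (13.20, 12.90, 23.0, 22.4, 22.0, 21.5),
    (13.75, 12.30, 24.5, 23.5, 23.3, 22.6),
    (13.55, 12.80, 21.2, 20.8, 20.5, 20.1),
]


def standardize(rows):
    """Rescale every column to zero mean and unit standard deviation (assumed scaling)."""
    stats = [(mean(col), pstdev(col)) for col in zip(*rows)]
    return [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]


def select_slice(rows, column, lo, hi):
    """Return indices of rows whose value in `column` lies in [lo, hi]."""
    idx = COLUMNS.index(column)
    return [i for i, row in enumerate(rows) if lo <= row[idx] <= hi]


standardized = standardize(catalog)

# "Highlight" a narrow slice of the magr histogram, then "broadcast" it:
# the same row indices are looked up in every other column.
picked = select_slice(catalog, "magr", 23.0, 23.5)

for j, name in enumerate(COLUMNS):
    values = [standardized[i][j] for i in picked]
    print(f"{name:5s} spans {min(values):+.2f} .. {max(values):+.2f} (standardized)")
```

Each selected row, read left to right across the six standardized columns, is one polyline in the feature vector plot; the printed spans are the per-column ranges the text describes.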

What if I don't want to do an SQL query? I just want a big local file with all the data.

  • Download and gunzip a DLS catalog, say F1p22.cat (your browser may do the gunzipping for you).

  • Get the format file corresponding to your catalog: F1p22.fmt or F4p22.fmt or Release2.fmt. A format file simply tells Mirage about the structure of the catalog.

  • In Mirage, choose Console->New Dataset with Options. You get a more sophisticated Load popup now:

    Fill in F1p22.cat and F1p22.fmt on the first two lines (this has already been done in the image above), and click Load.

  • Once the loading progress bar finishes, you get a table view much like the table view shown under "Getting DLS data", and now you can skip to "Using Mirage to examine the data".

Does Mirage work with images?

Yes. Drag this symbol (you can find it along the left-hand side of the canvas) onto any plot, and you will see this popup:

Give it a name such as F1p22BVR.jpg, and click on import (we'll get to the row/col identifiers later). Then you'll see the color image in the area previously occupied by the plot. You can use the buttons found at the top right to manipulate it. From top to bottom, they pan the image as you drag it with the mouse; zoom in; zoom out; realign the image with the top left corner; and fit the image to the window. To do any of these, you first click on the button and then click on the image window.

If you move the cursor onto the image and leave it for a second, the x,y coordinates of that point on the JPEG pop up. Note that the display convention is to put the origin at upper left. The x,y coordinates on the JPEG are not the same as the x,y coordinates found in the catalog, both because of this convention and because the JPEGs have been binned to make them a manageable size. If you wish to make overlays, you must first replace the x and y columns in the catalog with x/2 and (8192-y)/2. Then in the Load Image popup, make sure to enter Y for row identifier and X for column identifier (this is tricky because "row identifier" appears first and the habit is to put x first). Then when selections are broadcast, overlays will appear as yellow circles on the JPEG:

We are working on a way to associate RA and DEC with a JPEG, so that these coordinate system problems will disappear.
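Until then, the catalog-to-JPEG conversion described above (x/2 and (8192-y)/2) can be applied with a short script. This is a minimal sketch: the constants 8192 and 2 come from the text, while the function name is hypothetical and any sub-pixel offset conventions are ignored.

```python
FULL_HEIGHT = 8192  # full-resolution image height in pixels, from "(8192-y)/2"
BIN_FACTOR = 2      # the JPEGs are binned by a factor of 2


def catalog_to_jpeg(x, y):
    """Convert catalog pixel coordinates to JPEG coordinates (origin at upper left)."""
    jpeg_x = x / BIN_FACTOR
    jpeg_y = (FULL_HEIGHT - y) / BIN_FACTOR  # flip the vertical axis, then bin
    return jpeg_x, jpeg_y


# Example: a catalog object at x=1000, y=2000.
print(catalog_to_jpeg(1000, 2000))
```

The converted values would replace the x and y columns in the catalog before loading, with Y entered as the row identifier and X as the column identifier.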


The author of Mirage is Tin Ho. You can find her contact info and lots more Mirage documentation at the Mirage home page.

Last updated April 11, 2003.