The Big Bad Headache

This blog has been a huge help for me. The reason my posts are so long is that they are more of a thinking tool for me than they are a blog. When I write down my thoughts, things become a little clearer for me. This, however, means that what I write takes a lot more effort for others to read, and it doesn’t make much sense to anyone but myself. In today’s post, I will try to provide a more comprehensible explanation of what I’m facing.

I may throw in some random bits that are just my thoughts and are not necessarily vital to the task at hand. If I do, they will look like this. Feel free to skip these.


 

The problem is that I do not know if there is a way to test whether or not a URL leads to a working image before it is added to the array and then loaded in the browser. I’m not even sure where to start looking for answers because I’m not sure what to ask. Is this an issue of Computer Networking, Telnet, and Servers? Or is it something else?

Now, I mentioned before that solving this mystery would make a not-very-useful feature possible: having a set number of images appear, regardless of how many images fail to load. But that wouldn’t be the only feature that solving the mystery would bring to fruition. I realized that solving this problem would also enable us to add more wavelength sets. This would be huge because it would enable us to provide software updates to our users that include new content (new wavelengths) for them to explore.


 

To begin, I will introduce the SDO image URLs and how they are generated.

The URLs of the images generated by the SDO follow a specific format:

YYYYMMDD_hhmmss_RRR_WAVE.jpg

  • YYYY = Year
  • MM = Month
  • DD = Day
  • hh = Hour
  • mm = Minutes
  • ss = Seconds
  • RRR = image size
  • WAVE = wavelength type

***If you want to explore the SDO gallery yourself, feel free: http://sdo.gsfc.nasa.gov/assets/img/browse/

Let’s use an example to explore each part of the URL:

fig 1. Take a look… Pretty straightforward, right?

We can see that the image was either taken or uploaded at midnight on December 25, 2013. We also know that the image size is 256x256px, and that the image shows the Sun in the HMIB wavelength.
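The same reading can be done in code. Below is a minimal Python sketch (not the project's actual code) that splits a file name in the YYYYMMDD_hhmmss_RRR_WAVE.jpg format into its parts. The function name parse_sdo_name is made up for illustration, and the sketch assumes the name always has exactly four underscore-separated parts.

```python
from datetime import datetime


def parse_sdo_name(name):
    """Split a name like 20131225_000000_1024_HMIB.jpg into its parts.

    Hypothetical helper: assumes exactly four underscore-separated fields.
    """
    stem = name.rsplit(".", 1)[0]                      # drop the ".jpg" extension
    date_part, time_part, size, wave = stem.split("_")
    stamp = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    # The wavelength stays a string so that "0131" keeps its leading zero.
    return {"timestamp": stamp, "size": int(size), "wave": wave}


info = parse_sdo_name("20131225_000000_1024_HMIB.jpg")
print(info["timestamp"], info["size"], info["wave"])
```

Keeping the wavelength as text matters because some wavelength codes are letters (HMIB) and others are zero-padded numbers (0131).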

Now let’s take a look at how our code works.

The code can generate the image URLs because a few parts of the image URLs from the SDO are very predictable. Take a look at the image URL from 12/25/13 and compare it to the one from 12/26/13:

fig 2

20131225_000000_1024_HMIB.jpg

20131226_000000_1024_HMIB.jpg

They look nearly identical, right? The only difference is one digit: 20131225 versus 20131226. So the only thing that seems to change is the eight-digit date stamp. So we can simply have the code increment the date, taking changing months and years into account, and keep the “_000000_1024_HMIB.jpg” part the same, right?
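For a wavelength like HMIB, where that holds, the idea can be sketched in a few lines of Python. This is only an illustration of the approach, not the project's actual code; the function name daily_names and its default arguments are assumptions, and it produces file names only, not full URLs.

```python
from datetime import date, timedelta


def daily_names(start, days, time_stamp="000000", size="1024", wave="HMIB"):
    """Build one file name per day, changing only the date stamp.

    Hypothetical helper: assumes the time stamp is the same every day.
    """
    names = []
    for offset in range(days):
        day = start + timedelta(days=offset)   # handles month and year rollover
        names.append(f"{day.strftime('%Y%m%d')}_{time_stamp}_{size}_{wave}.jpg")
    return names


for name in daily_names(date(2013, 12, 25), 2):
    print(name)
```

Letting the date type do the incrementing means the code never has to know how many days a month has.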

Nope. Unfortunately, the URLs are not always this nice and consistent. Only 4 other wavelength types share this same consistency. The other wavelengths, however, are not so predictable. Take a look at the URLs of wavelength 0131 on two separate days:

fig 3

20131225_000034_1024_0131.jpg

20131226_000046_1024_0131.jpg

The date incremented as expected, but this time the six-digit time stamp changed as well. While the two HMIB images were either taken or uploaded at exactly midnight, the two 0131 images were taken/uploaded at different times from each other: one was taken/uploaded 34 seconds after midnight, and the other 46 seconds after midnight. Take a look at a few more from a more recent month, and you will notice that there is no predictable pattern of when the 0131 images are taken/uploaded:

fig 4. 001146, 001246, 001234, 001222, 001158, 001210… As far as I can tell, these time stamps are just random.
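To compare the fig 4 time stamps more easily, they can be converted to seconds after midnight. The short Python sketch below does that; the helper name seconds_after_midnight is made up for illustration, and the list holds only the six time stamps quoted in the caption.

```python
def seconds_after_midnight(time_stamp):
    """Convert an hhmmss string such as "001146" to seconds after midnight."""
    hours = int(time_stamp[0:2])
    minutes = int(time_stamp[2:4])
    seconds = int(time_stamp[4:6])
    return hours * 3600 + minutes * 60 + seconds


# The six time stamps listed in the fig 4 caption.
stamps = ["001146", "001246", "001234", "001222", "001158", "001210"]
offsets = [seconds_after_midnight(stamp) for stamp in stamps]
print(min(offsets), max(offsets))  # earliest and latest offset in seconds
```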

 

Side Thought (not essential to the task at hand, you can skip this if you’d like) Also, you are probably wondering why I have been saying “taken/uploaded” throughout this post, instead of just one or the other. Take a look at the date and time under “Last Modified,” and compare it to the date and time stamps of the corresponding URLs. You will notice that the “Last Modified” date and time is always before the URL date and time. I am guessing that “Last Modified” indicates the time at which these images were taken, while the image URL represents the time that the images were uploaded. But I may be wrong, because I have no way of being sure that the clock on the SDO is in sync with the clock of wherever the images are being uploaded. What time zone is the SDO operating in? GMT like the ISS?

There also seems to be no correlation between the “Last Modified” date and time and the image URL’s date and time. All images share the same “Last Modified” time of 20:18, but none of them share the same URL time stamp. Because of this irregularity, the code cannot predict the URLs of the images, and therefore we are unable to add most of the wavelength image sets that we would like to. It is a shame, because the wavelength image sets that we cannot utilize are the ones that are the most visually appealing.

None of these beautiful wavelengths can be used, unless we find a solution…

I have thought about ways to include these wavelengths for a while now, and there are two solutions that I have arrived at: A and B. As we continue, keep in mind that for a time-lapse, we only need 1 image per day…

Solution A. If you take a minute to examine the time stamps of the URLs again, you will notice that while they do not exactly match each other, they are all within a certain range. In fig 4, the time stamps of the URLs always seem to fall between 001000 and 001300. Using this trend, I could write a for-loop that would:

  1. Start at 20140610_001000_1024_0131.jpg.
  2. Test to see if that URL leads to an image.
    1. If the URL leads to a broken image (404 not found), continue to step 3.
    2. If the URL leads to a working image, add the URL to the array (the list of working images), then skip to step 4.
  3. Increment the testing URL’s time stamp by 1 second, to 20140610_001001_1024_0131.jpg, and repeat step 2.
    1. This continues until either a working image URL is found or the for-loop reaches 20140610_001300_1024_0131.jpg. If it does reach 001300, continue to step 4.
  4. Increment the testing URL’s date stamp by 1 day and reset the time stamp, to 20140611_001000_1024_0131.jpg, and repeat step 2.
  5. The for-loop runs until the array is filled with X working URLs (which translates to X days).

(Note that “20140610” is just an example date-stamp. It will be determined by the date on which the program is loaded)
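The steps above can be sketched in Python, with the step 2 test left as a function that gets passed in, since how to perform that test is the open question. This is a sketch under assumptions, not the project's code: the names candidate_names and collect_working_names are made up, the code works with file names rather than full URLs, and the max_days guard is my own addition so that the loop always stops even if too few images are found.

```python
from datetime import date, datetime, timedelta


def candidate_names(day, size="1024", wave="0131",
                    window_start="001000", window_end="001300"):
    """Yield one candidate file name per second inside the time window."""
    moment = datetime.combine(day, datetime.strptime(window_start, "%H%M%S").time())
    last = datetime.combine(day, datetime.strptime(window_end, "%H%M%S").time())
    while moment <= last:
        yield f"{moment:%Y%m%d_%H%M%S}_{size}_{wave}.jpg"
        moment += timedelta(seconds=1)         # step 3: next second


def collect_working_names(start_day, wanted, leads_to_image, max_days=365):
    """Collect at most one working name per day until `wanted` are found.

    `leads_to_image` is a placeholder for the step 2 test; `max_days` is an
    added safety limit that is not part of the original steps.
    """
    working = []
    day = start_day
    for _ in range(max_days):
        if len(working) >= wanted:             # step 5: array is full
            break
        for name in candidate_names(day):
            if leads_to_image(name):           # step 2: test the candidate
                working.append(name)
                break                          # found one, skip to step 4
        day += timedelta(days=1)               # step 4: next day
    return working


# Stand-in for the real test: pretend only these two images exist.
known = {"20140610_001146_1024_0131.jpg", "20140612_001234_1024_0131.jpg"}
print(collect_working_names(date(2014, 6, 10), 2, known.__contains__, max_days=5))
```

With a window from 001000 to 001300, each day costs up to 181 tests, so the real test would need to be cheap.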

The only thing preventing me from writing this code is the two sub-steps of step 2. As you know, I do not know if there is a way to test whether a URL leads to a broken image or a working image before it is added to the array and then loaded in the browser…
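One candidate that might be worth trying is an HTTP HEAD request, which asks a server for the response headers of a URL without downloading the image itself. The Python sketch below illustrates the idea outside the browser, so it is not a drop-in answer for the project. The function name url_leads_to_image and the open_url parameter are made up for illustration, and whether the SDO server answers HEAD requests with a 404 status for missing images and an image content type for working ones is an assumption that would need checking.

```python
import urllib.request


def url_leads_to_image(url, open_url=urllib.request.urlopen, timeout=10):
    """Return True if a HEAD request for `url` succeeds with an image type.

    Hypothetical helper. `open_url` can be swapped for a stand-in so the
    function can be tried without touching the network.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with open_url(request, timeout=timeout) as response:
            content_type = response.headers.get("Content-Type", "")
            return response.status == 200 and content_type.startswith("image/")
    except OSError:
        # urllib's HTTPError (e.g. 404), URLError and timeouts are all
        # subclasses of OSError, so every failure counts as "no image".
        return False
```

Treating every failure as "no image" keeps the caller simple, at the cost of not telling a missing image apart from a network problem.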

Solution B. The other solution is something of a semi-fix. We could simply download 100 images (days) of each of the currently unobtainable wavelengths, and load them directly into the browser. This means that the Sun under these wavelengths will never be updated, and the user can only view a time-lapse of the Sun within a permanently fixed interval. After all, the purpose of this project is to create an educational tool, not so much a research tool. A school teacher does not really need a “live” version of the Sun in order to point out certain features such as solar flares and sunspots to a class. Having a “live” projection of the Sun is a cool, but not terribly essential, feature.

We could provide the functioning 5 wavelengths, but also have a separate mode that allows the users to view the other wavelengths that are not “live.” This does have some benefits too. Because the images will be downloaded onto the laptop, these wavelength sets of the Sun can be viewed without access to the internet.


I would prefer solution A, but I do not know if it is even possible. Maryam got me interested in learning more about Computer Networking and Servers, so I will be able to learn a lot along the way while chasing solution A. However, solution B may be the most practical approach, especially since the outcome will be certain, unlike solution A…

I still have to think things through a bit more before I make a decision… If you read this far, I want to thank you for taking the time to learn about my problems. I know it is not your job to solve my problems, but any help at all is welcome and immensely appreciated!

Thanks!~
