2

note: To preserve @MichaelClark's substantial answer below, I've volunteered to leave this question here and allow it to be closed, instead of just deleting it (see discussion below the answer). That means I'm volunteering to eat the downvotes, so please take a moment to consider it before casting one. Thanks!


Is it possible to reconstruct an approximation of the original raw sensor data from downloaded iPhone 6 images by interpreting the metadata and mathematically undoing whatever was done in the phone?

I understand this is not the best way to proceed, but I would like to keep my phone app-less, and I believe at this time I cannot access the raw data in the phone without a 3rd party app.

I generally use Python for everything I do, but if there is free software I could consider it. If the math is out there I can write my own script as well. I'm looking to understand what I'm doing more than finding a quick solution.

If I see something interesting or useful with this, I'll bring in a DSLR later.

What I'm planning to do:

I would like to look at relatively small color differences between two regions of an image, and see if the shift increases or decreases between one photo and another. For example, in each image I'll define two rectangles, then I'll just calculate r = R/(R+G+B) where R, G and B are each integrated within the rectangle. So r1 and r2 might be 0.31 and 0.32, and I'll say there's a shift of 0.01.
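Here's roughly what I mean in Python (Pillow + NumPy); a minimal sketch, with the file name and rectangle coordinates as placeholders:

    import numpy as np
    from PIL import Image

    def channel_ratio(path, box):
        # Mean r = R / (R + G + B) over a rectangular region.
        # box is (left, upper, right, lower) in pixels, as used by
        # Pillow's Image.crop().
        img = Image.open(path).convert("RGB")
        patch = np.asarray(img.crop(box), dtype=np.float64)
        r, g, b = patch.reshape(-1, 3).sum(axis=0)
        return r / (r + g + b)

    # Placeholder rectangles; compare two regions within one photo.
    r1 = channel_ratio("photo.jpg", (100, 100, 300, 300))
    r2 = channel_ratio("photo.jpg", (400, 100, 600, 300))
    print(r2 - r1)  # the "shift", e.g. 0.01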

The images are fairly flat - white paper reflecting ambient light. I'll maintain the same distance and angle. (note: I'm only asking here about the data processing step so I'm only loosely describing this to better give an idea how I'll use the data, not how I'll interpret it.)

I'll do the same for the next photo, and see if the shift is larger or smaller. This is "science stuff" but not serious science stuff and I understand this is not the best way to proceed.

uhoh
  • 1,817
  • 15
  • 29
  • What, exactly, are you attempting to accomplish by comparing differences between various images of white paper reflecting ambient light? Are you trying to measure the color of the ambient light? The intensity? The reflectivity of the paper in terms of color and/or intensity? Are you trying to define the texture of various white papers? – Michael C Jan 31 '17 at 05:09
  • 1
    If you don't want to use a 3rd party app, and you know how to program, why not write your own app to get RAW images from the camera and do what you want with them yourself? – user1118321 Jan 31 '17 at 05:09
  • @user1118321 I can script in Python; that's pretty much my limit. There may also be security issues with kludged, rookie apps on phones. – uhoh Jan 31 '17 at 05:56
  • 1
    I've voted to close this question as unclear what you are asking because it is an XY problem. – Michael C Jan 31 '17 at 06:09
  • I agree and would vote to close as well if I had sufficient rep. – uhoh Jan 31 '17 at 06:28
  • Apparently, volunteering to allow this to be closed instead of just deleting it requires that I also eat drive-by downvotes :) – uhoh Jan 31 '17 at 09:11
  • 3
    I'm voting to close this question as off-topic because it is about using a camera as a measuring device rather than about photography. – mattdm Jan 31 '17 at 09:17
  • @mattdm there is a whole field of scientific and medical photography you are excluding. One could be documenting wildlife or bacteria, counting numbers or gauging movement. Photogrammetry. If that's the wish of this Stack Exchange community, then if it isn't already, maybe it can be explicitly codified? Are you thinking that only objectively aesthetic photography questions should be considered on-topic? – uhoh Jan 31 '17 at 09:40
  • @mattdm astrophotography of Saturn's moons is on-topic, but if you are going to plot their movement then it's off-topic? – uhoh Jan 31 '17 at 09:46
  • 1
    Imagine a function y = sin(x) and say it is RAW data. Then imagine the series a(n) = sin(nπ) = 0 and say it is JPEG. What you are trying is interpolation of an unknown function from an infinite set of zeros. First match: f(x) = 0. Experienced guess: f(x) = a·sin(bx), b = π, a = "anything". – Crowley Jan 31 '17 at 10:01
  • @mattdm I've asked here. – uhoh Jan 31 '17 at 10:09
  • @Crowley I enjoy mathematical discussions very much, but it is hard without Stack Exchange's MathJax support and probably not appropriate for comments on a closing question. Please wait until I re-post the improved question. Thanks! – uhoh Jan 31 '17 at 10:14
  • @uhoh Yes? I mean, if you're using a camera to take photographs of nails, that's on topic. If you are using that camera to pound the nails into boards, off topic. – mattdm Jan 31 '17 at 13:37
  • @mattdm I don't understand how your analogy - taken at face value - applies here. – uhoh Jan 31 '17 at 13:43
  • 1
    The test for topicality is really simple: if it's about the art and science of making photographs, then it's on topic. If it's about using photographic equipment as a tool, where the desired result isn't anything to do with photography per se (in your case, measurement data), it's off topic. – mattdm Jan 31 '17 at 14:19
  • 1
    In this specific case, it's particularly important, because your question title ("approximating raw image data") might have general photographic applicability, but you don't actually care about that (see your later comment, "I just mean to approximately undo some of the transforms and correction"). I don't mean that in a bad way; it just makes for a misleading question. What you want to do is not the same as what someone who is aiming for a photographic result might want to do. – mattdm Jan 31 '17 at 14:21
  • @mattdm Hmm... I see the act of approximately undoing some of the transforms and corrections as "approximating raw image data". Maybe using "data" to refer to a digital image is the problem. I have to confess I see everything digital as data, even if it is also art. If you think this might be an issue, what change to the wording would you recommend? – uhoh Jan 31 '17 at 14:36
  • As I understand it, you want to increase grayscale resolution by reversing the color interpolation. Is that correct? (The common reason people want to "undo" conversion to JPEG is to make white balance corrections, apply a different tone curve, or to gain dynamic range; does that help explain what I'm getting at?) – mattdm Jan 31 '17 at 14:56
  • @mattdm why don't I do some reading on what you've just said, give it some thought, and answer within a day. That might be true, and if I can understand that better, that may be very helpful. – uhoh Jan 31 '17 at 14:59
  • 4
    @uhoh Works for me. :) I don't mean to be over-pedantic, but I really do want to keep this site's Q&A focused. If it's 90% engineering and photometrics, that scares away people interested in composition, lighting, storytelling, etc. That doesn't mean there can't be highly-technical questions; I'd just like to see that within the frame, as it were. We can discuss this more on http://meta.photo.stackexchange.com/, if you like. (Note also that I'm not anyone special here; just a guy with strong opinions.) – mattdm Jan 31 '17 at 15:06
  • @mattdm I'm still working on it, thanks again for the suggestions. – uhoh Feb 06 '17 at 02:17

2 Answers

6

Nope. The reason is that not all of the raw data is represented in any single interpretation of that data used to produce a JPEG image. Much of the raw data has been discarded and cannot be recovered.

When you open a raw image file with an application, what you see on the screen is not the unfiltered, raw data. It is one interpretation of the raw data based on the default settings of the application doing the rendering. If the settings are changed, the application goes back to the raw data and recomputes the image, producing another, different interpretation. But the full set of the raw data is not represented in either of the two interpretations. Only some of the information in the raw file is used for each rendering.

(Yes, it is theoretically possible to produce an image with such a limited range of brightness and color that the information in the raw file is so uniform from one pixel to the next that it can all be represented in an 8-bit image. But who wants solid pure white or solid black photos? When taking a photo of a scene that contains areas of differing brightness, colors, and textures the total data in the raw file almost certainly exceeds the capability of an 8-bit display or 8-bit image format.)

If there is any change in brightness or color from one pixel to the next, the gradations will be much finer/smoother in a raw file than in a JPEG. An 8-bit JPEG allows only 256 discrete values per color channel between totally dark and totally bright. A 12-bit raw file allows 4,096 discrete values per channel; a 14-bit raw file allows 16,384.

Suppose the values along a line of 64 pixels in a 14-bit file increase by one with each successive pixel as we move from left to right: that line contains 64 discrete values. Along the same line in an 8-bit file there would be only a single change of one unit. Now suppose a second line of 64 pixels whose 14-bit values are more irregular, going up and down as we move along the line, but staying within the same 64 discrete values as the first line. Once converted to 8 bits, we wouldn't be able to tell the first line from the second: there are 64 possible values in the 14-bit raw file that all transform to the same value in the 8-bit JPEG.

Now consider those 64 possible values at every pixel in the image. The number of different 14-bit raw files that would be represented by the exact same 8-bit JPEG is 64 raised to the power of the pixel count. Even for a 1 MP image, that is 64^1,000,000!
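To make that collapse concrete, here is a minimal numerical sketch (Python/NumPy). It assumes a plain truncating 14-bit-to-8-bit conversion; a real camera pipeline applies a tone curve first, but the many-to-one collapse is the same:

    import numpy as np

    def to_8bit(v14):
        # Truncate 14-bit values to 8 bits: 16384 / 256 = 64 distinct
        # raw values collapse onto each JPEG value.
        return (np.asarray(v14) >> 6).astype(np.uint8)

    ramp = np.arange(64, dtype=np.uint16)  # smooth 14-bit ramp: 64 distinct values
    jitter = np.random.default_rng(0).permutation(ramp)  # irregular line, same 64 values

    print(np.unique(to_8bit(ramp)))    # [0] -- the whole ramp becomes one value
    print(np.unique(to_8bit(jitter)))  # [0] -- indistinguishable from the ramp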

For more about what kind of information is contained in a raw file, please see: RAW files store 3 colors per pixel, or only one?

Michael C
  • 175,039
  • 10
  • 209
  • 561
  • To double check, you understand I just mean to approximately undo some of the transforms and corrections by applying the inverse transforms in the right order, and that I am only looking for an approximate result. Of course it cannot be done exactly, but is there enough metadata (or other data) to at least get a handle on things like gamma, sRGB, or similar? I would just implement it in Python, so floating point limitations in software might be avoided. – uhoh Jan 31 '17 at 05:28
  • You'll never get more information out of a JPEG than what is in there. If you know the exact algorithm (good luck getting that out of Apple) that was used to transform the raw sensor data to the 8-bit JPEG, you're still unable to produce gradations smoother than the 256 discrete value limit imposed by 8 bits. That is severely limiting compared to 12- or 14-bit raw data. The R, G, & B multipliers and such are usually included in EXIF data from most dedicated cameras. What, if any, of that information is included in the EXIF data of an iPhone photo I know not. – Michael C Jan 31 '17 at 05:38
  • Since I'm just going to look at the average color in a, say, 200x200 pixel area, and look for changes of the order of a percent, I'm after substantially less information than is there. I think your answer is very helpful for professional photographers and complex images, but I've written the question in such a way as to make it clear that I am asking something different. – uhoh Jan 31 '17 at 05:43
  • Even if you have everything included in the EXIF standard, you're still limited by the 8-bit depth of your data. If you try to undo gamma curves, for instance, you're not going to get a linear straight line response, you're going to get an irregular, chunky looking stairstep response. – Michael C Jan 31 '17 at 05:43
  • It seems like you've written the question as an XY problem, instead of really asking what it is you are trying to accomplish. – Michael C Jan 31 '17 at 05:44
  • I will be careful how I handle the averaging of the 40,000 pixels. I've tried to explain in the question that I am not asking if the results will be good enough for my purposes, and I have said that if I do see something of interest I will then bring in a DSLR. – uhoh Jan 31 '17 at 05:47
  • Yes, indeed it seems that is exactly what I am doing! OK I can rewrite this as an X question in about 2 hours when I get to a real keyboard. Would it be OK to leave this here until then or should I delete now and undelete then? Thank you for your help by the way! – uhoh Jan 31 '17 at 05:54
  • Or you can close it and maintain this answer, and I'll ask a separate question. – uhoh Jan 31 '17 at 06:02
  • 1
    Close is probably better than delete. – Michael C Jan 31 '17 at 06:07
  • That's fine. You're welcome to edit the question if it makes the pair more useful/informative in the future, clean up comments, etc. – uhoh Jan 31 '17 at 06:16
  • Apparently, volunteering to allow this to be closed instead of deleting it requires that I also eat drive-by downvotes :) – uhoh Jan 31 '17 at 09:10
  • Nope was the correct answer. – WayneF Jan 31 '17 at 16:49
2

As Michael Clark has explained in detail, you cannot reconstruct the raw file because the mapping from the raw file to the JPEG is not one-to-one: many different raw files could have been processed into exactly the same JPEG file. However, this does not mean that all the possible raw files consistent with some given JPEG file are equally likely. It is possible to reconstruct the most likely raw file given a JPEG image. This requires computations quite similar to state-of-the-art noise reduction, where you eliminate the noise in a picture by calculating the most likely noise-free picture. Such methods are enormously computationally expensive (many hours or even days on a fast computer to process a single picture), but they usually yield much better results than the usual ad hoc methods.

In the case of JPEG, what you should try first is simply to decompress the JPEG better than the standard decompression algorithm does. A naive JPEG decompression does not yield the most likely image that, upon JPEG compression, would have produced the given JPEG file; obviously not, otherwise there would be no JPEG artifacts visible.
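If you do settle for naive decompression, a common first step toward the questioner's goal is to undo the standard sRGB transfer curve. This is a generic sketch, not a full reconstruction: it assumes the pipeline ended with a plain sRGB encode, and the phone's tone mapping, white balance, and noise reduction are not undone:

    import numpy as np

    def srgb_to_linear(u8):
        # Invert the standard sRGB transfer curve (IEC 61966-2-1).
        # Assumption: the JPEG was encoded with a plain sRGB curve.
        c = np.asarray(u8, dtype=np.float64) / 255.0
        return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

    linear = srgb_to_linear(np.arange(256))
    print(len(np.unique(linear)))  # 256 -- still only 256 levels, now
                                   # unevenly spaced (the stair-step Michael C
                                   # mentions), which is why averaging over a
                                   # large patch matters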

Count Iblis
  • 3,616
  • 1
  • 13
  • 17
  • Thanks - I think you are focusing mainly on the decompression of spatial aspects. I'll be choosing areas, say 200x200 pixels, that are almost featureless, and trying to approximate the average R, G, and B levels that would have been in the raw within these small areas. I think I can appreciate the difficulty of what you are talking about; it sounds like a computationally expensive but relatively straightforward maximum likelihood problem. – uhoh Feb 01 '17 at 04:08
  • By limiting to small patches of nearly uniform color and luminance and ultimately going after only the most likely average sensor R, G, and B levels that would have been within them, any down-sampling of the chroma channels would not have a major impact. Today I'll try to "go deep" and read further. I'll start with the answer(s) to RAW files store 3 colors per pixel, or only one? linked by @MichaelClark. – uhoh Feb 01 '17 at 04:19