Can't delete .RAW photos #32

Open
adjordan opened this issue Sep 13, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@adjordan

Pixel Pro phones have a feature that keeps the RAW version of a photo, so I see two versions of each photo in Google Photos: the standard ".jpg" and a RAW file, usually with the extension ".dng".

These photos are huge (>40 MB) and have distinct names, but I have been unable to delete these photos using the size or filename filter.

For instance, here are two photo names of the exact same picture:

  • PXL_20240526_162513871.RAW-01.COVER.jpg (8.5 MB)
  • PXL_20240526_162513871.RAW-02.ORIGINAL.dng (46.3 MB)

I would expect that filtering by size > 40000000 bytes or by filename ".dng" would select the .dng file for deletion, but neither filter finds it.
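To make the expectation concrete, here is a minimal Python sketch of the two filters applied to the filenames above. The function `filter_files` is hypothetical and is not the tool's code; the byte sizes are rough conversions of the MB figures quoted in this issue.

```python
import re

# The two files from the example above. Sizes are approximate conversions
# of the MB figures in the issue, not exact byte counts.
files = [
    ("PXL_20240526_162513871.RAW-01.COVER.jpg", 8_500_000),
    ("PXL_20240526_162513871.RAW-02.ORIGINAL.dng", 46_300_000),
]


def filter_files(items, name_regex=None, min_size=None):
    """Return the names that pass both (optional) filters."""
    pattern = re.compile(name_regex, re.IGNORECASE) if name_regex else None
    selected = []
    for name, size in items:
        if pattern and not pattern.search(name):
            continue
        if min_size is not None and size <= min_size:
            continue
        selected.append(name)
    return selected


by_name = filter_files(files, name_regex=r"\.dng$")
by_size = filter_files(files, min_size=40_000_000)
```

Both filters select only the .dng here, which is the behaviour I expected from the tool.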

@xob0t
Owner

xob0t commented Sep 24, 2024

I'm not able to replicate this.
I set up an album with two images, one PNG and one DNG.
I filtered it by filename with regex .*\.dng
The dng was identified.

@ringerc

ringerc commented Nov 21, 2024

I think the issue is with Google Photos storing a .JPG and associated raw as a single "photo" - it now effectively combines them into a photo-stack.

I was also trying to delete only my RAW images, since Google "improved" Photos to upload RAW files with no option to skip them. I was unable to delete only the RAW images using this tool.

The underlying filenames still exist but Photos manages them as a stack.

E.g. on the phone I have PXL_20241004_234853957.RAW-02.ORIGINAL.dng for a raw image, and PXL_20241004_234853957.RAW-01.COVER.jpg for the JPEG.

In Photos this renders as a stack (screenshot omitted) with individual files PXL_20241004_234853957.RAW-01.COVER.jpg (the default image for the stack) and PXL_20241004_234853957.RAW-02.ORIGINAL.dng (secondary image for the stack). They have distinct URLs:

  • https://photos.google.com/search/PXL__20241004__234853957/photo/AF1QipMHLzqMfMmHXiySfSHUW7vI7veAhjifC1ZxPFo_
  • https://photos.google.com/search/PXL__20241004__234853957/photo/AF1QipPfUNrxRoEmFZ7oKeO_YDxXXtVmAB_GhLUjZadk

I'm guessing the tool doesn't see images in stacks other than the top image, which is why a regex like .*\.dng$ doesn't match.

If I run a job to match .*\.dng$ and move to trash, it matches standalone .dng files I uploaded from before Google "improved" Photos, but does not match those that are part of "stacks" from after the "improvement".

E.g. this log

 Filter: All media with filename matching regex ".*\.dng$" 
Start Time 16:21:55
Reading library
Found 338 items
Found 107 items
Found 60 items
Found 163 items
...
Found 188 items
Source read complete
Found items: 58005
Getting items' media info
Processing 5000 items
Processing 5000 items
Processing 5000 items
Processing 5000 items
Processing 5000 items
Processing 5000 items
Processing 5000 items
Processing 5000 items
Processing 5000 items
Processing 5000 items
Processing 5000 items
Processing 3005 items
Filtering by filename
Item count after filtering: 1
Items to process: 1
Moving 1 items to trash
Processing 1 items
Task completed in 00:03:35

This moved only one .dng to trash: a file I had uploaded separately rather than through Photos' auto-backup feature.
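The result in this log is what you would get if the listing exposed only each stack's cover filename. The toy model below illustrates that guess; `LibraryItem`, its fields, and the standalone filename `IMG_0001.dng` are all hypothetical and say nothing about how Photos stores stacks.

```python
import re
from dataclasses import dataclass, field


@dataclass
class LibraryItem:
    """Hypothetical model of one entry returned by the library listing."""
    cover_filename: str
    # Stack members that the listing does not expose (assumption).
    stacked_filenames: list = field(default_factory=list)


library = [
    LibraryItem("IMG_0001.dng"),  # made-up standalone upload
    LibraryItem(
        "PXL_20241004_234853957.RAW-01.COVER.jpg",
        ["PXL_20241004_234853957.RAW-02.ORIGINAL.dng"],
    ),
]

pattern = re.compile(r".*\.dng$")

# What a filename filter matches if it only sees cover filenames.
visible_matches = [
    item.cover_filename for item in library if pattern.match(item.cover_filename)
]

# What it would match if stack members were enumerated as well.
all_names = [
    name
    for item in library
    for name in [item.cover_filename, *item.stacked_filenames]
]
all_matches = [name for name in all_names if pattern.match(name)]
```

In this model the filter finds one file when it only sees covers and two once stack members are included, which mirrors the single match in the log.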

(Google seem to have put a lot of effort into making it as hard as possible to avoid backing up your raw .dng files, and as hard as possible to delete just the .dng later. True ensh1tification at work again.)

I think it does this stacking based on filename, in which case reproducing it shouldn't be too tricky. Otherwise let me know and I'll try to grab some data from the model.

@ringerc

ringerc commented Nov 21, 2024

I modified the item loading code to log items found and captured a few samples.

It looks like the initial results don't include filenames. Another function extendMediaItemsWithMediaInfo fetches that when required, and itemBulkMediaInfoParse handles the results. A call to itemBulkMediaInfoParse(...) gets itemData like:

["AF1QipMgwPVvPQ8m22vpUz6LQo_mZmpcWZI_wtK7JDkW",[null,null,"","PXL_20241005_005559925.RAW-01.COVER.jpg",null,null,1728089759925,46800000,1728123892468,4801397,100,null,null,null,null,null,null,null,2,null,null,null,null,1,["20241005_005559925",1,null,2],231702,null,null,null,null,null,null,null,null,[1,4801397,2,0,null,1]],null,null,null,["AF1QipMgwPVvPQ8m22vpUz6LQo_mZmpcWZI_wtK7JDkW",["105555929858285613830",null,"7422235602528031890"]]]

Note the filename "PXL_20241005_005559925.RAW-01.COVER.jpg"

So from my limited understanding of what's going on here, the media key points to the "main" image for a photo stack. This script doesn't seem to support photo stacks, and doesn't "understand" that there are other files in the same stack.
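For anyone who wants to poke at this sample offline, here is a rough Python sketch that loads the itemData above (line breaks added) and pulls out the media key and filename. Index 3 for the filename is taken from the sample itself. Treating the nested list that starts with a timestamp-like string as a stack identifier is only a guess.

```python
import json
import re

# The itemData sample from above, reflowed across lines.
ITEM_DATA = json.loads("""
["AF1QipMgwPVvPQ8m22vpUz6LQo_mZmpcWZI_wtK7JDkW",
 [null,null,"","PXL_20241005_005559925.RAW-01.COVER.jpg",null,null,
  1728089759925,46800000,1728123892468,4801397,100,
  null,null,null,null,null,null,null,2,null,null,null,null,1,
  ["20241005_005559925",1,null,2],231702,
  null,null,null,null,null,null,null,null,[1,4801397,2,0,null,1]],
 null,null,null,
 ["AF1QipMgwPVvPQ8m22vpUz6LQo_mZmpcWZI_wtK7JDkW",
  ["105555929858285613830",null,"7422235602528031890"]]]
""")

media_key = ITEM_DATA[0]
info = ITEM_DATA[1]
filename = info[3]

# Guess: a nested list whose first element looks like "YYYYMMDD_HHMMSSmmm"
# may identify the stack this item belongs to. This is unconfirmed.
stack_like = [
    entry
    for entry in info
    if isinstance(entry, list)
    and entry
    and isinstance(entry[0], str)
    and re.fullmatch(r"\d{8}_\d{9}", entry[0])
]
```

The only filename in the record is the cover JPEG's. Nothing in it names the .dng.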

Looking at getBatchMediaInfo it's hard to understand quite what it does, and the purpose of the long array of nulls etc it passes.

From what I see so far the script doesn't seem to support media stacks, and further reverse-engineering of Photos' internal JavaScript/web-services API would be needed to handle media stacks.

The mediaKey is definitely the same as what's in the photo URL. But I have no idea how to debug the minified JavaScript from photos to figure out how Photos discovers whether a media item is a stack, and how it enumerates the items in the stack. This is the first time I've ever done interactive JavaScript debugging in a browser. So I'm pretty much stuck at this point.

It's unclear whether the script needs to use a different endpoint to request info on stacks, use different arguments to an existing endpoint, or what. Photos makes a truly amazing number of requests on page load for a photo stack, so picking it out of the request/responses doesn't look simple either.

@ringerc

ringerc commented Nov 21, 2024

I tried using Firefox's network view while loading the page for media key AF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo, which I know is a cover JPG, to see where the known media key for the associated DNG, AF1QipOQz5xLng4T__9p-C1hsmx6QScVDIl03Lyk5he7, appears in the request/response stream. Firefox has a wonderful resources search feature for this.

The request POST URL with a response containing the DNG media key was:

https://photos.google.com/_/PhotosUi/data/batchexecute?rpcids=wgZjtc&source-path=%2Fphoto%2FAF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo&f.sid=-2763207787640693710&bl=boq_photosuiserver_20241120.02_p0&hl=en&soc-app=165&soc-platform=1&soc-device=1&_reqid=1236810&rt=c

So it's making an RPC call, wgZjtc, which this userscript does not appear to know about or use at present. There may be a different call for listing, though: this call was made when I loaded one specific image page.

The parameters, expanded, are:

rpcids=wgZjtc
source-path=%2Fphoto%2FAF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo
f.sid=-2763207787640693710
bl=boq_photosuiserver_20241120.02_p0
hl=en
soc-app=165
soc-platform=1
soc-device=1
_reqid=1236810
rt=c
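The same expansion can be reproduced with Python's standard library, which helps when comparing many captured URLs. A small sketch, using the URL captured above:

```python
from urllib.parse import parse_qs, urlsplit

url = (
    "https://photos.google.com/_/PhotosUi/data/batchexecute"
    "?rpcids=wgZjtc"
    "&source-path=%2Fphoto%2FAF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo"
    "&f.sid=-2763207787640693710&bl=boq_photosuiserver_20241120.02_p0"
    "&hl=en&soc-app=165&soc-platform=1&soc-device=1&_reqid=1236810&rt=c"
)

parts = urlsplit(url)
# parse_qs returns a list per key; each key appears once here, so keep
# the first value. Percent-escapes such as %2F are decoded.
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

for key, value in params.items():
    print(f"{key}={value}")
```

Unlike the listing above, this prints source-path decoded, as /photo/ followed by the cover image's media key.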

The POST data was

f.req=[[["wgZjtc","[\"20241003_033933062\",null,1,1]",null,"generic"]]]
at=AALKv19Ot1QY32E2ifOf76UFZIGV:1732223608464

so it's asking for details on an item "20241003_033933062" by the looks, presumably listing stack elements for it.
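The f.req value is JSON nested inside JSON: the argument list is serialized to a string, and that string is placed inside the outer envelope. Here is a sketch of rebuilding the captured body in Python. `build_f_req` is a hypothetical helper. The meanings of the trailing `null,1,1` arguments and of the `"generic"` marker are unknown to me, and the `at` token is a placeholder.

```python
import json
from urllib.parse import urlencode


def build_f_req(rpc_id, args):
    """Serialize args to a JSON string, then wrap it in the outer envelope."""
    inner = json.dumps(args, separators=(",", ":"))
    envelope = [[[rpc_id, inner, None, "generic"]]]
    return json.dumps(envelope, separators=(",", ":"))


# Same arguments as the captured request; what they mean is not known.
f_req = build_f_req("wgZjtc", ["20241003_033933062", None, 1, 1])

# The "at" value is a session-specific token; this one is a placeholder.
body = urlencode({"f.req": f_req, "at": "PLACEHOLDER_TOKEN"})
```

The `f_req` string comes out character-for-character the same as the captured f.req line above.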

Request headers, with __Secure headers and cookies redacted, were:

POST /_/PhotosUi/data/batchexecute?rpcids=wgZjtc&source-path=%2Fphoto%2FAF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo&f.sid=-2763207787640693710&bl=boq_photosuiserver_20241120.02_p0&hl=en&soc-app=165&soc-platform=1&soc-device=1&_reqid=1236810&rt=c HTTP/2
Host: photos.google.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:132.0) Gecko/20100101 Firefox/132.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br, zstd
Content-Type: application/x-www-form-urlencoded;charset=utf-8
Content-Length: 168
Referer: https://photos.google.com/
X-Same-Domain: 1
x-goog-ext-353267353-jspb: [null,null,null,128907]
Origin: https://photos.google.com
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-origin
Connection: keep-alive
Cookie: REDACTED
OTZ=7830557_4_4_133320_8_385320
Sec-GPC: 1
TE: trailers

Response headers with __Secure and set-cookie redacted were:

HTTP/2 200 
content-type: application/json; charset=utf-8
vary: Sec-Fetch-Dest, Sec-Fetch-Mode, Sec-Fetch-Site
cache-control: no-cache, no-store, max-age=0, must-revalidate
pragma: no-cache
expires: Mon, 01 Jan 1990 00:00:00 GMT
date: Thu, 21 Nov 2024 21:13:31 GMT
content-disposition: attachment; filename="response.bin"; filename*=UTF-8''response.bin
x-content-type-options: nosniff
strict-transport-security: max-age=31536000
content-security-policy: require-trusted-types-for 'script';report-uri /_/PhotosUi/cspreport
accept-ch: Sec-CH-UA-Arch, Sec-CH-UA-Bitness, Sec-CH-UA-Full-Version, Sec-CH-UA-Full-Version-List, Sec-CH-UA-Model, Sec-CH-UA-WoW64, Sec-CH-UA-Form-Factors, Sec-CH-UA-Platform, Sec-CH-UA-Platform-Version
cross-origin-opener-policy: same-origin-allow-popups
permissions-policy: ch-ua-arch=*, ch-ua-bitness=*, ch-ua-full-version=*, ch-ua-full-version-list=*, ch-ua-model=*, ch-ua-wow64=*, ch-ua-form-factors=*, ch-ua-platform=*, ch-ua-platform-version=*
cross-origin-resource-policy: same-site
content-encoding: br
server: ESF
x-xss-protection: 0
x-frame-options: SAMEORIGIN
set-cookie: REDACTED
alt-svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000
X-Firefox-Spdy: h2

Response payload was

)]}'
1503
[["wrb.fr","wgZjtc","[null,null,[[\"AF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo\",[\"https://photos.fife.usercontent.google.com/pw/AP1GczOUs2DF9FPBr-jTnd5fxYyqVe436EWlxsAXuDL90-s0twwaI5C0qOAVPA\",4080,3072,null,null,null,null,null,[4080,3072,1,null,[\"Google\",\"Pixel 6\",null,2.35,2.2,55,4.63E-4,null,1]],[3553847],2,[[true,true]]],1727926773062,\"EQnFRQw2XMPOZJXLCTWFk4RNV8E\",46800000,1727942010502,null,[[1],[2],[3],[4],[8],[21],[14],[15],[17],[18],[19],[22],[28,false,true],[29,false,true],[35,false,true]],2,null,null,null,null,null,231700,{\"119286562\":[0,false,\"20241003_033933062\",null,[\"20241003_033933062\",1,null,2]],\"119908567\":[\"20241003_033933062\",true],\"318563170\":[[1,4269672,2,0,null,1]]}],[\"AF1QipOQz5xLng4T__9p-C1hsmx6QScVDIl03Lyk5he7\",[\"https://photos.fife.usercontent.google.com/pw/AP1GczOEcls9XFpGkh6qc8WtJ-W6w7hDRlQfb8EdnIEzBJ0dhW_LyI_4uZ6v_g\",3990,3000,null,null,null,null,null,[3990,3000,11,null,[\"Google\",\"Pixel 6\",null,2.35,2.2,52,4.6306197E-4,null,1]],[3553589]],1727926773216,\"gGdV7mBqgfKwM16Y8VzGHvKj33k\",46800000,1727942015094,null,[[1],[2],[3],[4],[8],[21],[14],[15],[17],[18],[19],[22],[28,false,true],[29,false,true],[35,false,true]],2,null,null,null,null,null,220280,{\"119286562\":[0,false,\"20241003_033933062\",null,[\"20241003_033933062\",null,null,2]],\"119908567\":[\"20241003_033933062\"],\"318563170\":[[1,14287608,2,0,null,1]]}]],\"20241003_033933062\"]",null,null,null,"generic"],["di",125],["af.httprm",125,"913100324897604153",40]]
26
[["e",4,null,null,1541]]
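A rough Python sketch of decoding a response in this shape follows. It assumes, based only on the capture above, that the body starts with a `)]}'` guard line and then alternates numeric length lines with single-line JSON chunks, and that the useful payload is the JSON string at index 2 of the `"wrb.fr"` entry. The sample is rebuilt in code with the per-photo fields trimmed, so it is not the full captured payload.

```python
import json


def decode_batchexecute(raw):
    """Return {rpc_id: decoded_payload} for every "wrb.fr" entry in raw."""
    results = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip the ")]}'" guard and the numeric chunk-length lines.
        if not line.startswith("["):
            continue
        for entry in json.loads(line):
            if entry and entry[0] == "wrb.fr" and isinstance(entry[2], str):
                results[entry[1]] = json.loads(entry[2])
    return results


# Trimmed stand-in for the captured payload: same outer shape, photo
# details replaced by a placeholder string.
payload = [None, None, [
    ["AF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo", ["<trimmed>"]],
    ["AF1QipOQz5xLng4T__9p-C1hsmx6QScVDIl03Lyk5he7", ["<trimmed>"]],
], "20241003_033933062"]
envelope = json.dumps(
    [["wrb.fr", "wgZjtc", json.dumps(payload), None, None, None, "generic"],
     ["di", 125]]
)
raw = "\n".join(
    [")]}'", str(len(envelope)), envelope, "26", '[["e",4,null,null,1541]]']
)

decoded = decode_batchexecute(raw)
stack_keys = [item[0] for item in decoded["wgZjtc"][2]]
```

`stack_keys` holds the cover JPEG's media key followed by the DNG's, in the same order as the captured response.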

I've saved the whole request/response as "HAR" on disk, in case that's needed, but it's trivial to repro anyway.

(I'm aware this data reveals the timestamp these images were taken, their geolocation, etc, and I don't really care).

@ringerc

ringerc commented Nov 21, 2024

I tried loading the main photo list view with network recording on, and scrolling down to the relevant image/period. Firefox's search doesn't seem to find the media key for the known DNG, nor the media key for the associated cover JPEG. I think this is a problem with the search itself rather than missing data, as the search seems to hang indefinitely.

The thumb URL for the above cover-image AF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo in the list view is https://photos.fife.usercontent.google.com/pw/AP1GczOUs2DF9FPBr-jTnd5fxYyqVe436EWlxsAXuDL90-s0twwaI5C0qOAVPA=w286-h215-no?authuser=0. Page elements content:

<div class="rtIMgb nV0gYe e37Orb" jsname="NwW5ce" jscontroller="RcgMC" style="width: 286px; height: 215px; transition: none; transform: translate3d(578px, 1152px, 0px);" jslog="6959; track:click; 8:AF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo"><a class="p137Zd" tabindex="0" jsaction="click:eQuaEb;focus:AHmuwe; blur:O22p3e;" style="" href="./photo/AF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo" aria-label="Burst photo - Landscape - Oct 3, 2024, 4:39:33 PM - 2 photos in sequence"><div class="RY3tic" style="opacity: 1; background-image: url(&quot;https://photos.fife.usercontent.google.com/pw/AP1GczOUs2DF9FPBr-jTnd5fxYyqVe436EWlxsAXuDL90-s0twwaI5C0qOAVPA=w286-h215-no?authuser=0&quot;), url(&quot;https://photos.fife.usercontent.google.com/pw/AP1GczOUs2DF9FPBr-jTnd5fxYyqVe436EWlxsAXuDL90-s0twwaI5C0qOAVPA=w72-h54-k-no?authuser=0&quot;);" data-latest-bg="https://photos.fife.usercontent.google.com/pw/AP1GczOUs2DF9FPBr-jTnd5fxYyqVe436EWlxsAXuDL90-s0twwaI5C0qOAVPA=w286-h215-no?authuser=0"><div class="eGiHwc" aria-hidden="true"></div><div class="KYCEmd" aria-hidden="true"></div></div></a><div class="RmSd1b" aria-hidden="true" style=""></div><div role="checkbox" jsaction="mousedown:KamsZ; click:KamsZ; focus:AHmuwe; blur:O22p3e" aria-label="Burst photo - Landscape - Oct 3, 2024, 4:39:33 PM - 2 photos in sequence" aria-checked="false" class="QcpS9c ckGgle" tabindex="0" jslog="25457; track:click;"><svg width="24px" height="24px" class="v1262d kWbB0e" viewBox="0 0 24 24"><circle cx="12" cy="12" r="17"></circle></svg><svg width="24px" height="24px" class="v1262d rqet2b" viewBox="0 0 24 24"><path d="M12 2C6.48 2 2 6.48 2 12s4.48 10 10 10 10-4.48 10-10S17.52 2 12 2zm0 18c-4.42 0-8-3.58-8-8s3.58-8 8-8 8 3.58 8 8-3.58 8-8 8z"></path></svg><svg width="24px" height="24px" class="v1262d eoYPIb" viewBox="0 0 24 24"><circle cx="12" cy="12" r="8"></circle></svg><svg width="24px" height="24px" class="v1262d orgUxc" viewBox="0 0 24 24"><path d="M12 2C6.48 2 2 6.48 2 12s4.48 10 10 10 10-4.48 
10-10S17.52 2 12 2zm-2 15l-5-5 1.41-1.41L10 14.17l7.59-7.59L19 8l-9 9z"></path></svg></div><div class="Tee6gf"></div><div class="GzIbP"><div class="P8pGvd"><svg width="24px" height="24px" class="v1262d ktdIWe" viewBox="0 0 24 24" aria-hidden="true"><path d="M4.5 13h1.1l.9 2H8l-.9-2.1c.5-.3.9-.8.9-1.4v-1C8 9.7 7.3 9 6.5 9H3v6h1.5v-2zm0-2.5h2v1h-2v-1zM16 15h1.5l.5-3 .5 3h2l1-6H20l-.5 3-.5-3h-2l-.5 3-.5-3h-1.5l1 6zm-5.12-1.5h1.25l.37 1.5H14l-1.5-6h-2L9 15h1.5l.38-1.5zm.87-1.5h-.5l.25-1 .25 1z"></path></svg></div><div class="bmpxFe"><div class="Q3N5Kd"></div><svg width="24px" height="24px" class="v1262d hcdKNb" jsaction="click:eQuaEb;focus:AHmuwe; blur:O22p3e" aria-label="Open" viewBox="0 0 24 24" title="Open" role="button" tabindex="0" jslog="38279; track:click"><path d="M15.5 14h-.79l-.28-.27A6.471 6.471 0 0 0 16 9.5 6.5 6.5 0 1 0 9.5 16c1.61 0 3.09-.59 4.23-1.57l.27.28v.79l5 4.99L20.49 19l-4.99-5zm-6 0C7.01 14 5 11.99 5 9.5S7.01 5 9.5 5 14 7.01 14 9.5 11.99 14 9.5 14zm1-7.5h-2v2h-2v2h2v2h2v-2h2v-2h-2z"></path></svg></div></div></div>

I eventually dug the request/response out of the network traffic.

It was a request

https://photos.google.com/_/PhotosUi/data/batchexecute?rpcids=lcxiM&source-path=%2F&f.sid=1274596181689105744&bl=boq_photosuiserver_20241120.02_p0&hl=en&soc-app=165&soc-platform=1&soc-device=1&_reqid=1940959&rt=c

with decoded POST data

[
  {
    "name": "f.req",
    "value": "[[[\"lcxiM\",\"[null,1728108139999,null,null,1,1,1727902885000]\",null,\"generic\"]]]"
  },
  {
    "name": "at",
    "value": "AALKv1_QE3vr9T5TfZTkgaF3hxew:1732227757021"
  },
  {
    "name": "",
    "value": ""
  }
]

(POST text "f.req=%5B%5B%5B%22lcxiM%22%2C%22%5Bnull%2C1728108139999%2Cnull%2Cnull%2C1%2C1%2C1727902885000%5D%22%2Cnull%2C%22generic%22%5D%5D%5D&at=AALKv1_QE3vr9T5TfZTkgaF3hxew%3A1732227757021&")
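The raw POST text can be unpacked in two steps: form-decode it, then parse the JSON string nested inside the f.req envelope. A short Python sketch using the captured text:

```python
import json
from urllib.parse import parse_qs

post_text = (
    "f.req=%5B%5B%5B%22lcxiM%22%2C%22%5Bnull%2C1728108139999%2Cnull%2Cnull"
    "%2C1%2C1%2C1727902885000%5D%22%2Cnull%2C%22generic%22%5D%5D%5D"
    "&at=AALKv1_QE3vr9T5TfZTkgaF3hxew%3A1732227757021&"
)

# parse_qs drops the empty field left by the trailing "&".
fields = parse_qs(post_text)

# Outer envelope: [[[rpc_id, "<JSON string of args>", null, "generic"]]]
envelope = json.loads(fields["f.req"][0])
rpc_id = envelope[0][0][0]
args = json.loads(envelope[0][0][1])
```

The two large numbers in `args` look like millisecond timestamps bounding the visible date range, but that is a guess from their magnitude and not something the capture confirms.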

The response uses the same strange format as the other API requests: text that mixes escaped JSON with non-JSON content such as the RPC ID lcxiM.

The response item for this specific image, after decoding and unescaping from .response.content.text, was:

["AF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo",["https://photos.fife.usercontent.google.com/pw/AP1GczOUs2DF9FPBr-jTnd5fxYyqVe436EWlxsAXuDL90-s0twwaI5C0qOAVPA",4080,3072,null,null,null,null,null,[null,null,1],[3553847]],1727926773062,"EQnFRQw2XMPOZJXLCTWFk4RNV8E",46800000,1727942010502,null,[[1],[2],[3],[4],[8],[21],[14],[15],[17],[18],[19],[22],[28,false,true],[29,false,true],[35,false,true]],2,null,null,null,null,null,231700,{"119286562":[2,false,"20241003_033933062",null,["20241003_033933062",1,[2,1],2]],"129168200":[[null,null,null,null,null,2],[[-408665194,1754942000],null,null,false,[[null,[["Kaituna, Wellington",null,1,false,false]],"0x6d40de1be6915bc1:0x500ef6143a2b180"]],3]],"163238866":[false]}]

This has the media-key, the thumbnail URI, and ... other stuff. There are some "2"s in there that might indicate a stack-size, but it is hard to tell for sure.

So far I haven't been able to figure out how to go from the listed media key for the cover image to the media key and filename for the DNG.
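One possible lead, unconfirmed: the string "20241003_033933062" appears in this list item under the key "119286562", and the same string was the first argument of the wgZjtc request captured earlier. The Python sketch below extracts that string from a trimmed copy of the list item. The helper `stack_id_of`, and the idea that this key always carries a stack identifier, are assumptions drawn from this one sample.

```python
import json

# Trimmed copy of the list item above: thumbnail fields and the other
# keyed entries are omitted.
LIST_ITEM = json.loads("""
["AF1QipONxXAAWtZIe49vPoYA2B3LTMTPit-aC1XgcVZo",
 ["<thumbnail fields trimmed>"],
 1727926773062,
 {"119286562": [2, false, "20241003_033933062", null,
                ["20241003_033933062", 1, [2, 1], 2]]}]
""")


def stack_id_of(item, key="119286562"):
    """Return the candidate stack ID from the trailing dict, or None."""
    extras = item[-1] if isinstance(item[-1], dict) else {}
    entry = extras.get(key)
    if isinstance(entry, list) and len(entry) > 2 and isinstance(entry[2], str):
        return entry[2]
    return None


candidate = stack_id_of(LIST_ITEM)

# Arguments in the same shape as the captured wgZjtc request.
wgzjtc_args = [candidate, None, 1, 1]
```

If the guess holds, the chain would be: list item, then candidate stack ID, then a wgZjtc call, then the media keys of every file in the stack. It would need testing against items that are not in a stack.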

I saved a bunch of .har records of network traffic etc.

I've hit my limit for trying to decode this, as I don't really understand how this API works or how the fields are interpreted, and the script doesn't really explain the fields. Hopefully the data collected helps someone else.

I might see if I can do this with the photos application API instead ... before they shut it down in March next year, by restricting it to only photos managed by that app. Sigh. Thanks Google, user friendly as always.

@xob0t xob0t added the enhancement New feature or request label Dec 25, 2024