Hi there, I am seeing a large number of missing XML files and am wondering whether the problem is on my end or with the S3 bucket. Of 300 objects tested, about 80% (241) threw a 404 error:
In download.file(url, file_path, quiet = TRUE)
cannot open URL 'https://s3.amazonaws.com/irs-form-990/201723259349300000_public.xml': HTTP status was '404 Not Found'
When I try to access one of these files through the browser, I see this message:
Code: NoSuchKey
Message: The specified key does not exist.
Key: 201700379349300224_public.xml
RequestId: CFAMJNK6P9V903JZ
HostId: TkCHSmCi+R7gHMceu2ZZp3jIrEGWT3IQlmAXn28iLU7S0pfltJ/Bvz+TZeGhhhBkEOT95EI3BQA=
3 sample keys that are in the index and return XML
- https://s3.amazonaws.com/irs-form-990/201743189349310624_public.xml
- https://s3.amazonaws.com/irs-form-990/201743189349310944_public.xml
- https://s3.amazonaws.com/irs-form-990/201703149349302400_public.xml
5 sample keys that are in the index but return 404
- https://s3.amazonaws.com/irs-form-990/201703189349302176_public.xml
- https://s3.amazonaws.com/irs-form-990/201703149349302624_public.xml
- https://s3.amazonaws.com/irs-form-990/201743199349311168_public.xml
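For anyone who wants to reproduce the check, here is a minimal sketch in Python (the original test used R's download.file). It builds each URL from an object ID using the pattern seen in the links above and groups the IDs by HTTP status. The function names `build_url`, `head_status`, and `tally_statuses` are made up for this illustration, and the IDs are passed as strings so that no digits can be altered along the way.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# URL pattern taken from the sample links in this issue.
BASE_URL = "https://s3.amazonaws.com/irs-form-990"


def build_url(object_id):
    """Return the public XML URL for an object ID given as a string."""
    return f"{BASE_URL}/{object_id}_public.xml"


def head_status(url, timeout=10):
    """Send a HEAD request and return the HTTP status code.

    Error statuses such as 404 are returned instead of raised.
    Network-level failures (DNS, timeouts) are not handled here.
    """
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code


def tally_statuses(object_ids, fetch_status=head_status):
    """Group object IDs by the status code their URL returns."""
    by_status = {}
    for object_id in object_ids:
        status = fetch_status(build_url(object_id))
        by_status.setdefault(status, []).append(str(object_id))
    return by_status
```

Calling `tally_statuses(ids)` with the default `fetch_status` makes one real network request per ID; passing a different callable lets the grouping logic be exercised offline.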
For context, I tried a similar exercise about 2 years ago and anecdotally remember the occasional missing XML file, but not close to 80%.
Any ideas about what is going on? Thanks!
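One possibility worth ruling out, offered as a hypothesis and not a confirmed cause: every object ID quoted in this issue is a multiple of 32, which is the spacing between adjacent 64-bit floating-point values at this magnitude (between 2^57 and 2^58). That pattern would appear if the 18-digit IDs had been stored as numbers instead of strings at some point, in which case most of them would be rounded to a neighbouring value that may not exist as a key. The Python sketch below shows the effect. The helper names are made up for illustration, and the ID ending in 177 is a hypothetical value, not one taken from the index.

```python
def as_double_would_store(object_id):
    """Return the ID as it would read after a round trip through a 64-bit float."""
    return str(int(float(object_id)))


def survives_double_round_trip(object_id):
    """True if storing the ID as a 64-bit float leaves it unchanged."""
    return as_double_would_store(object_id) == str(int(object_id))


# IDs quoted in this issue (from the URLs and the NoSuchKey message).
quoted_ids = [
    "201723259349300000",
    "201700379349300224",
    "201743189349310624",
    "201743189349310944",
    "201703149349302400",
    "201703189349302176",
    "201703149349302624",
    "201743199349311168",
]

# Every quoted ID is exactly representable as a double.
all_exact = all(survives_double_round_trip(i) for i in quoted_ids)

# A hypothetical ID that is not a multiple of 32 gets rounded.
hypothetical_id = "201703189349302177"
rounded = as_double_would_store(hypothetical_id)  # "201703189349302176"
```

If the IDs in the source index do not all end in a multiple of 32, comparing them as strings against the IDs actually requested would show whether any were changed before the download step.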