Spatie Media Library: URL cannot be reached
I recently ran into the following error during a scraping project.
Spatie\MediaLibrary\MediaCollections\Exceptions\UnreachableUrl

Url `https://somedomain.com/someimagefile.jpg` cannot be reached
After further testing, it appears that the site was blocking direct access during scraping:
Failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden
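A quick way to confirm this kind of block is to request the URL with PHP's default stream context and look at the status line of the response. The sketch below is only an illustration: `statusCodeFromHeaders` is a hypothetical helper name, and it assumes the header lines come from something like PHP's `$http_response_header` variable after a `file_get_contents()` call.

```php
<?php

/**
 * Extract the HTTP status code from a list of raw response header lines,
 * such as the ones PHP stores in $http_response_header.
 * Hypothetical helper for illustration; not part of the media library.
 */
function statusCodeFromHeaders(array $headerLines): ?int
{
    $status = null;

    foreach ($headerLines as $line) {
        // Redirects produce several status lines; keep the last one seen.
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $matches) === 1) {
            $status = (int) $matches[1];
        }
    }

    return $status;
}

// Usage sketch (needs network access, so it is left commented out):
// $body = @file_get_contents('https://somedomain.com/someimagefile.jpg');
// $code = statusCodeFromHeaders($http_response_header ?? []);
// A 403 here, while the same URL opens fine in a browser, points to header-based blocking.
```

If the status is 403 with the default context but the image loads in a browser, the request headers are the likely cause.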
Luckily, Spatie Media Library lets you swap the downloader it uses to fetch external URLs.
Create the file app/Support/MediaLibrary/Downloaders/CustomDownloader.php:
<?php

namespace App\Support\MediaLibrary\Downloaders;

use Spatie\MediaLibrary\Downloaders\Downloader;
use Spatie\MediaLibrary\MediaCollections\Exceptions\UnreachableUrl;

class CustomDownloader implements Downloader
{
    public function getTempFile(string $url): string
    {
        // Send a browser-like User-Agent so the request is not rejected outright.
        $context = stream_context_create([
            'http' => [
                'header' => 'User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36',
            ],
        ]);

        // file_get_contents() returns the response body, or false on failure.
        $contents = @file_get_contents($url, false, $context);

        if ($contents === false) {
            throw UnreachableUrl::create($url);
        }

        $temporaryFile = tempnam(sys_get_temp_dir(), 'media-library');

        file_put_contents($temporaryFile, $contents);

        return $temporaryFile;
    }
}
In config/media-library.php, change the media_downloader key to:
'media_downloader' => \App\Support\MediaLibrary\Downloaders\CustomDownloader::class,
This helps when a source blocks requests that do not look like they come from a browser. Depending on the source, you may need to tweak the headers further until you find a combination it accepts, e.g. by sending a Googlebot User-Agent.
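To make that kind of header tweaking less fiddly, the header string can be built from an array so individual headers are easy to add or swap per source. This is a minimal sketch, not part of the media library: `buildDownloadContext` is a hypothetical helper name, and the `Accept`, `Referer`, and timeout values are assumptions for illustration that may need adjusting for a given site.

```php
<?php

/**
 * Build an HTTP stream context from an associative array of headers.
 * Hypothetical helper for illustration; not part of spatie/laravel-medialibrary.
 *
 * @param array<string, string> $headers Header name => header value.
 * @return resource A stream context usable with file_get_contents() or fopen().
 */
function buildDownloadContext(array $headers, int $timeoutSeconds = 10)
{
    $lines = [];

    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }

    return stream_context_create([
        'http' => [
            // Multiple headers are joined with CRLF, as in a raw HTTP request.
            'header' => implode("\r\n", $lines),
            'timeout' => $timeoutSeconds,
        ],
    ]);
}

// Example header set; the values are illustrative assumptions.
$context = buildDownloadContext([
    'User-Agent' => 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36',
    'Accept' => 'image/*,*/*;q=0.8',
    'Referer' => 'https://somedomain.com/',
]);
```

The resulting context could then be passed as the third argument to `file_get_contents()` inside a custom downloader's `getTempFile()` method.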