Spatie Media Library url cannot be reached
I recently ran into the following error during a scraping project:
Spatie\MediaLibrary\MediaCollections\Exceptions\UnreachableUrl
Url `https://somedomain.com/someimagefile.jpg` cannot be reached
After further testing, it appears that the site was blocking direct access during scraping:
Failed to open stream: HTTP request failed! HTTP/1.1 403 Forbidden
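To confirm that the block really depends on how the request identifies itself, it can help to request the same URL with and without a browser-like User-Agent and compare the status codes. The following is only a rough sketch: `lastStatusCode()` and `probeStatus()` are hypothetical helper names, not part of the Media Library, and it assumes the site answers over plain HTTP(S) through PHP's stream wrapper.

```php
<?php

/**
 * Return the status code of the last HTTP status line in a list of
 * response headers (after redirects the last status line is the final one).
 */
function lastStatusCode(array $responseHeaders): ?int
{
    $status = null;

    foreach ($responseHeaders as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $matches)) {
            $status = (int) $matches[1];
        }
    }

    return $status;
}

/**
 * Request a URL, optionally with a custom User-Agent, and return the
 * final HTTP status code, or null if no response was received.
 */
function probeStatus(string $url, ?string $userAgent = null): ?int
{
    // ignore_errors lets PHP return the response even for 4xx/5xx statuses.
    $http = ['ignore_errors' => true, 'timeout' => 10];

    if ($userAgent !== null) {
        $http['header'] = 'User-Agent: ' . $userAgent;
    }

    @file_get_contents($url, false, stream_context_create(['http' => $http]));

    // PHP fills $http_response_header in the local scope after an HTTP request.
    return lastStatusCode($http_response_header ?? []);
}
```

Calling `probeStatus($url)` and then `probeStatus($url, 'Mozilla/5.0 ...')` for the failing image should show whether the User-Agent alone makes the difference, for example a 403 for the first call and a 200 for the second.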
Luckily, the Spatie Media Library lets you swap the downloader it uses to fetch external URLs.
Create the file app/Support/MediaLibrary/Downloaders/CustomDownloader.php:
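<?php

namespace App\Support\MediaLibrary\Downloaders;

use Spatie\MediaLibrary\Downloaders\Downloader;
use Spatie\MediaLibrary\MediaCollections\Exceptions\UnreachableUrl;

class CustomDownloader implements Downloader
{
    public function getTempFile(string $url): string
    {
        // Send a browser-like User-Agent so the remote site does not reject the request.
        $context = stream_context_create([
            'http' => [
                'header' => 'User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36',
            ],
        ]);

        $contents = @file_get_contents($url, false, $context);

        // Surface a failed download as UnreachableUrl instead of writing an empty temp file.
        if ($contents === false) {
            throw UnreachableUrl::create($url);
        }

        $temporaryFile = tempnam(sys_get_temp_dir(), 'media-library');
        file_put_contents($temporaryFile, $contents);

        return $temporaryFile;
    }
}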
In config/media-library.php, change the media_downloader key to:
'media_downloader' => \App\Support\MediaLibrary\Downloaders\CustomDownloader::class,
This helps when a source blocks requests that do not look like they come from a browser. Depending on the source, you may need to tweak the headers further until you find a set that the site accepts, e.g. by sending Googlebot headers.
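As a rough illustration of sending more than one header, the sketch below builds the `header` option from an associative array. `buildHeaderString()` is a hypothetical helper, and the header values (the Googlebot User-Agent, the `Accept` list, the `Referer` and the timeout) are assumptions to adjust per site, not values any particular site is known to accept.

```php
<?php

/**
 * Build the value for the 'header' option of an HTTP stream context
 * from an associative array of header names and values.
 */
function buildHeaderString(array $headers): string
{
    $lines = [];

    foreach ($headers as $name => $value) {
        $lines[] = $name . ': ' . $value;
    }

    // Header lines in a stream context are separated by CRLF.
    return implode("\r\n", $lines);
}

// Hypothetical header set; swap in whatever the target site accepts.
$headers = [
    'User-Agent' => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'Accept'     => 'image/avif,image/webp,image/*,*/*;q=0.8',
    'Referer'    => 'https://somedomain.com/',
];

$context = stream_context_create([
    'http' => [
        'header'  => buildHeaderString($headers),
        'timeout' => 15, // seconds; an assumed value
    ],
]);
```

The resulting `$context` can be passed to `file_get_contents()` inside `getTempFile()` in place of the single User-Agent context.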