This package parses RSS feeds and saves their items into your Laravel application. It can fetch an entire feed, save images found in the feed items, retrieve the full content of each item, and extract clean text without ads, donation forms, or other unwanted content.
- Parses multiple RSS feeds.
- Saves images in the RSS feed items to a storage location.
- Retrieves the full content of each item in the RSS feed.
- Extracts clean text content: removes donation forms, ads, social sharing buttons, and other unwanted elements.
- Supports Spatie Media Library for storing images.
- Configurable content selectors per domain.
- Automatic removal of unwanted HTML elements before content extraction.

Requirements:
- PHP 7.4 or higher
- Laravel 5.5 or higher
- SimplePie PHP library 1.8 or higher
- Optional: Spatie Media Library (if enabled for image storage)
You can install this package via Composer using:
composer require kalimeromk/rssfeed

This package uses Laravel's auto-discovery feature, so you don't need to register the service provider.
This package supports optional configuration.
You can publish the configuration file using:
php artisan vendor:publish --provider="Kalimeromk\Rssfeed\RssfeedServiceProvider" --tag="config"

This will publish a rssfeed.php config file to your config directory. Here you can set various options for image storage, HTTP behavior, and content extraction.
return [
    // Storage and Spatie settings
    'image_storage_path' => 'images',
    'spatie_media_type' => 'image',
    'spatie_disk' => 'public',
    'spatie_enabled' => false,

    // HTTP options
    'http_verify_ssl' => true,
    'http_timeout' => 15,
    'http_retry_times' => 2,
    'http_retry_sleep_ms' => 200,

    // Content extraction selectors per domain
    'content_selectors' => [
        // 'example.com' => '//article',
    ],

    // Default selector for content extraction
    'default_selector' => '//article | //div[contains(@class, "entry-content")] | ...',

    // Selectors for elements to remove (ads, donations, etc.)
    'remove_selectors' => [
        '.donation-form', '.donate-box', '.share-buttons',
        '.comments', '.ad', '.sidebar', // ... and more
    ],
];

- image_storage_path: Specifies the path where images from RSS feed items should be stored (if not using Spatie Media Library).
- spatie_media_type: Defines the media collection type when using Spatie Media Library.
- spatie_disk: Specifies which Laravel storage disk to use.
- spatie_enabled: Set to true if you want to store images using Spatie Media Library.
- default_selector: The default selector to use when extracting the full content of an RSS feed item.
- content_selectors: Here you can map specific domains to custom XPath selectors for fetching full content from a post. If the post URL belongs to one of these domains, its selector will be used.
- remove_selectors: CSS selectors for elements to remove before extracting content (donation forms, ads, social sharing, comments, etc.).
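To illustrate how a per-domain selector could be chosen, here is a minimal sketch. It looks up the post URL's host in content_selectors and falls back to default_selector when nothing matches. The function name resolveSelector, the subdomain-matching rule, and the '//article' fallback are assumptions made for illustration; the package's own matching logic may differ.

```php
<?php

/**
 * Hypothetical helper (not part of the package): choose an XPath selector
 * for a post URL from a config array shaped like config/rssfeed.php.
 */
function resolveSelector(string $url, array $config): string
{
    $host = strtolower((string) parse_url($url, PHP_URL_HOST));

    foreach ($config['content_selectors'] ?? [] as $domain => $selector) {
        $suffix = '.' . strtolower($domain);
        // Assumed rule: match the domain itself or any of its subdomains.
        if ($host === strtolower($domain) || substr($host, -strlen($suffix)) === $suffix) {
            return $selector;
        }
    }

    // Assumed fallback when no default_selector is configured.
    return $config['default_selector'] ?? '//article';
}

$config = [
    'content_selectors' => ['example.com' => '//div[@class="article-body"]'],
    'default_selector'  => '//article',
];

echo resolveSelector('https://www.example.com/post/1', $config), PHP_EOL; // //div[@class="article-body"]
echo resolveSelector('https://other.org/post/2', $config), PHP_EOL;       // //article
```

Matching on the host rather than the full URL keeps one selector valid for every article on a site.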
Below are examples of how to use this package.
use Kalimeromk\Rssfeed\RssFeed;
$rss = app(RssFeed::class);
$items = $rss->parseRssFeeds('https://example.com/feed/');
foreach ($items as $item) {
    // $item is an array with keys: title, description, permalink, link, copyright,
    // author, language, content, categories, date, enclosure, images, image
    echo $item['title'];
    echo $item['content']; // Full HTML content
}

use Kalimeromk\Rssfeed\RssFeed;
$rss = app(RssFeed::class);
$items = $rss->parseRssFeedsClean('https://example.com/feed/');
foreach ($items as $item) {
    echo $item['title'];
    echo $item['content']; // Clean text without HTML, ads, donation forms
    echo $item['description']; // Also cleaned up
}

The parseRssFeedsClean() method automatically:
- Removes donation forms and payment sections
- Removes social sharing buttons
- Removes advertisements
- Removes comments sections
- Extracts plain text from HTML
- Removes common donation text patterns
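The general idea behind these steps can be sketched with PHP's built-in DOM extension: remove unwanted elements, then read the remaining text. This is not the package's implementation. The function name cleanHtmlToText is hypothetical, and for brevity it accepts plain class names rather than the CSS selectors used in remove_selectors.

```php
<?php

/**
 * Hypothetical sketch: drop elements carrying one of the given class
 * names, then return the remaining text with whitespace collapsed.
 * Assumes simple class names (no quotes or spaces).
 */
function cleanHtmlToText(string $html, array $classesToRemove): string
{
    if (trim($html) === '') {
        return '';
    }

    $dom = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // The meta tag asks the HTML parser to treat the input as UTF-8.
    $dom->loadHTML('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    foreach ($classesToRemove as $class) {
        $query = sprintf(
            '//*[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
            $class
        );
        // Copy matches to an array first; removing nodes from a live list is unsafe.
        foreach (iterator_to_array($xpath->query($query)) as $node) {
            if ($node->parentNode !== null) {
                $node->parentNode->removeChild($node);
            }
        }
    }

    $root = $dom->documentElement;
    if ($root === null) {
        return '';
    }

    return trim((string) preg_replace('/\s+/u', ' ', $root->textContent));
}

$html = '<article><p>Hello world. </p><div class="donation-form box">Donate now!</div><p>Bye.</p></article>';
echo cleanHtmlToText($html, ['donation-form', 'share-buttons']), PHP_EOL;
```

Removing nodes before reading the text ensures that the donation or sharing markup never reaches the extracted content.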
use Kalimeromk\Rssfeed\RssFeed;
$rss = app(RssFeed::class);
// Get full HTML content
$htmlContent = $rss->fetchFullContentFromPost('https://example.com/article/123');
// Get clean text content (recommended)
$cleanText = $rss->fetchCleanTextFromPost('https://example.com/article/123');

use Kalimeromk\Rssfeed\RssFeed;
$rss = app(RssFeed::class);
$feed = $rss->RssFeeds('https://example.com/feed/');
$title = $feed->get_title();
foreach ($feed->get_items() as $item) {
    // ... use SimplePie\Item API
}

You can save images found in the RSS feed items using the saveImagesToStorage method. This method accepts an array of image URLs and returns an array of saved image names. If Spatie Media Library is enabled and a model is provided, media will be attached to the model's collection.
You can also extract image URLs directly from a SimplePie item using:
$images = $rss->extractImagesFromItem($item);
$primaryImage = $images[0] ?? null;

$images = [
    'http://example.com/image1.jpg',
    'http://example.com/image2.jpg',
];
$rss = app(\Kalimeromk\Rssfeed\RssFeed::class);
$savedImageNames = $rss->saveImagesToStorage($images);

If you have Spatie Media Library enabled and you want to save images to a media collection:
use App\Models\Post;
$rss = app(\Kalimeromk\Rssfeed\RssFeed::class);
$post = Post::find(1); // Model should support addMediaFromUrl (Spatie Media Library)
$images = [
    'http://example.com/image1.jpg',
    'http://example.com/image2.jpg',
];
$savedImageNames = $rss->saveImagesToStorage($images, $post);

use Illuminate\Database\Eloquent\Model;
use Spatie\MediaLibrary\HasMedia;
use Spatie\MediaLibrary\InteractsWithMedia;
class Post extends Model implements HasMedia
{
    use InteractsWithMedia;
}

If you need to extract content from specific websites with unique HTML structure:
// config/rssfeed.php
'content_selectors' => [
    'example.com' => '//div[@class="article-body"]',
    'news.site.com' => '//article[contains(@class, "main-content")]',
],

Add your own selectors for elements to remove:
// config/rssfeed.php
'remove_selectors' => [
    // Default selectors...
    // Your custom selectors
    '.custom-ad-banner',
    '#newsletter-signup',
    '.site-specific-donation',
],

This package does not ship with a built-in Job class. If you need queueing, create a Laravel Job and inject the RssFeed service inside it.
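A queued job along these lines might look like the sketch below. The class name ImportRssFeed and its namespace are assumptions, and what you do with each item is up to your application. Only parseRssFeedsClean() comes from this package; the traits and the ShouldQueue interface are standard Laravel. The block needs a Laravel application to run.

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Kalimeromk\Rssfeed\RssFeed;

// Hypothetical job class, not shipped with the package.
class ImportRssFeed implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /** @var string URL of the feed to import */
    private $feedUrl;

    public function __construct(string $feedUrl)
    {
        $this->feedUrl = $feedUrl;
    }

    // Laravel resolves RssFeed from the container when the job is handled.
    public function handle(RssFeed $rss): void
    {
        foreach ($rss->parseRssFeedsClean($this->feedUrl) as $item) {
            // Persist $item['title'], $item['content'], etc. as your app requires.
        }
    }
}
```

With this in place, a feed could be queued with ImportRssFeed::dispatch('https://example.com/feed/'). Storing only the URL on the job keeps the serialized payload small.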
This package was created by KalimeroMK.
The MIT License (MIT). Please see License File for more information.