I have a WordPress site with thousands of image files. The problem is that the vast majority are redundant and just use up disk space. I need a way to find out which ones are actually referenced by the HTML so that I can delete those that aren't.
Maybe Selenium WebDriver could help? I could scrape the website and collect the value of the src attribute of every img element.
With the following code, the images collection is populated with 22 items, which is correct for this particular page. The problem is that I don't know how to get the value of each element's "src" attribute:
var images = driver.FindElements(By.TagName("img"));
foreach (var image in images)
{
    Debug.WriteLine(image.Text);
}
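One possible direction, as a minimal sketch rather than a confirmed solution: Selenium's IWebElement exposes GetAttribute, which reads a named attribute, whereas Text returns the element's visible text (empty for an img). The URL, the class name, and the choice of ChromeDriver below are assumptions made for illustration; newer Selenium releases may flag GetAttribute as deprecated in favor of GetDomAttribute or GetDomProperty.

```csharp
using System;
using System.Collections.Generic;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class ImageSrcSketch
{
    static void Main()
    {
        // Hypothetical URL; replace with a page of the site being checked.
        const string pageUrl = "https://example.com/";

        // Assumes Chrome and a matching driver are available on this machine.
        using (IWebDriver driver = new ChromeDriver())
        {
            driver.Navigate().GoToUrl(pageUrl);

            // A set drops duplicates when the same image appears more than once.
            var sources = new HashSet<string>();

            foreach (IWebElement image in driver.FindElements(By.TagName("img")))
            {
                // Read the src attribute instead of the element's visible text.
                string src = image.GetAttribute("src");
                if (!string.IsNullOrEmpty(src))
                {
                    sources.Add(src);
                }
            }

            foreach (string src in sources)
            {
                Console.WriteLine(src);
            }
        }
    }
}
```

This only covers img elements on a single page. Images referenced through srcset, CSS backgrounds, or other pages would need to be collected separately before deciding that a file is unused.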