Real insights in real time

Learn, share, and connect around Europe dataset solutions.
kexej28769@nongnue
Posts: 227
Joined: Tue Jan 07, 2025 4:41 am

Post by kexej28769@nongnue »

The 8 duplicate issues actually represent 18 pages, and the table returns all 18 affected pages. Sometimes the duplicates are obvious from the title and/or URL, but here there's a bit of a mystery, so let's pull up the export file. It has a column called "Duplicate Content Group," and sorting by that column reveals something like the following (the original export file has a lot more data)...



I’ve renamed “Duplicate Content Group” to “Group” and added a word count (“words”), which can be useful for verifying true duplication. Take a look at group number 7: these “Weekly Menu Plan” pages are image-heavy and share a common block of text before any unique text. While not 100% duplicates, these otherwise valuable pages can easily look like thin content to Google and represent a broader problem.
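If you'd rather do the renaming and grouping in a script than in a spreadsheet, here is a rough sketch in Python. Only the "Duplicate Content Group" column name comes from the export described above; the "URL" and "Title" columns, the sample rows, and the function names are assumptions for illustration, and "words" is the word count added by hand, not something the export is known to provide.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample standing in for the crawl export. Only the
# "Duplicate Content Group" column name is taken from the post;
# the other columns and all rows are made up for illustration.
SAMPLE_EXPORT = """URL,Title,Duplicate Content Group,words
https://example.com/menu/week-1,Weekly Menu Plan 1,7,120
https://example.com/menu/week-2,Weekly Menu Plan 2,7,115
https://example.com/about,About Us,3,640
https://example.com/about-us,About Us,3,640
"""


def group_duplicates(csv_file):
    """Return {group: [row, ...]}, sorted by group number.

    The long column name is shortened to "Group", as in the post.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        row["Group"] = int(row.pop("Duplicate Content Group"))
        row["words"] = int(row["words"])
        groups[row["Group"]].append(row)
    return dict(sorted(groups.items()))


def summarize(groups):
    """Yield (group, page_count, average_words) for each duplicate group."""
    for group, rows in groups.items():
        average = sum(r["words"] for r in rows) / len(rows)
        yield group, len(rows), average


groups = group_duplicates(io.StringIO(SAMPLE_EXPORT))
for group, pages, avg_words in summarize(groups):
    print(f"Group {group}: {pages} pages, about {avg_words:.0f} words each")
```

To run it against a real export, swap the `io.StringIO(...)` sample for `open("export.csv", newline="")`, where the filename is a placeholder. Groups with several pages and a low average word count are the ones worth checking by hand for thin content.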


Not counting the time spent writing the blog post, it took less than an hour to run this crawl and dive in, and even that short time uncovered more potential issues than I can cover in this post. In less than an hour, you can go into a client meeting or sales call with in-depth knowledge of any domain.