Developer blog Archive • Yoast

Testing the new Yoast SEO settings interface

Oksana Abakumova — Tue, 21 Feb 2023 13:59:51 +0000

As you may already know, we recently released a brand new version of the Yoast SEO plugin with a new settings user interface. We have covered the topic in several blog posts, including one that explains the thought process behind the redesign in general and, more specifically, why we did it now. In this article, as QA Lead at Yoast, I want to highlight the steps we took to ensure the quality of the new Settings UI.

This redesign not only looks impressive but also required a massive amount of collaboration between different teams within Yoast. It was truly a team effort, because we clearly understood the effect it would have on the experience of Yoast SEO users.

It was extremely important for us to deliver the new vision for our settings as smoothly as possible. A change of that size was a little bit scary, but we trusted every member of our team to deliver the best quality in their respective area, be it development, content, videos, support, etc.

Agile approach

When you rewrite a big part of a plugin, there are several approaches you can take:

Starting from scratch and completely removing the old parts.
Iterative improvements with both parts (new and old) being present in the plugin simultaneously.

Both approaches have their own pros and cons but, as the saying goes, we decided to eat the elephant one bite at a time. We started with a single page redesign that lived inside the plugin under a feature flag YOAST_SEO_NEW_SETTINGS_UI.
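To make the idea of a feature flag concrete, here is a minimal sketch. Only the flag name comes from this post. The plugin itself is written in PHP, so the Python below, the environment lookup, and the function names are all assumptions made for illustration, not the way Yoast SEO actually reads the flag.

```python
import os

# Flag name taken from the post; reading it from the environment is an assumption.
FLAG_NAME = "YOAST_SEO_NEW_SETTINGS_UI"


def is_enabled(flag, env=None):
    """Return True when the flag is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get(flag, "").strip().lower() in {"1", "true", "yes", "on"}


def render_settings_page(env=None):
    """Pick the new or the legacy settings screen, depending on the flag."""
    if is_enabled(FLAG_NAME, env):
        return "new settings UI"
    return "legacy settings UI"
```

With a switch like this, both screens can ship in the same release while only flagged sites see the new one.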

Now, what is the best way to test drastic changes in the plugin? For team Yoast, it’s deploying everything to a production environment with tons of everything: users, custom post types, taxonomies, etc. So we were dogfooding (that is, using our own product to test it) from the very beginning by using and testing everything on our own site, yoast.com. That’s what we always do with each and every release.

This approach allowed us to work on the redesign for months while still delivering new features and quality improvements, as well as bug fixes, to our users.

As part of our agile approach, we also shared an early alpha version of the new Settings UI with Yoast partners and various agencies. For anyone else interested in giving it a try, it was also shared on our developer blog. This gave them almost 2 months to update their documentation, videos, screenshots, etc. and to make sure they were ready, compatible, and familiar with the big changes.

Company-wide acceptance testing

For final acceptance testing before the Yoast SEO 20.0 release, we decided to engage people from all departments, so everyone’s knowledge and experience could help ensure that we deliver the highest quality product.

The process started in December 2022, a month before the release day on January 24th, 2023. All the areas that required manual testing were converted into individual GitHub issues, and individual team members or whole departments were assigned to one or several of these tasks. So, all in all:

30+ people were actively involved in this final round of testing.
Almost 70 test tasks were distributed across the involved people.
With everyone involved, almost the full month (150+ hours) was spent on testing.

We paid attention to a lot of areas which I will outline below.

Functional testing

We wanted to make sure that upgrading to v20 has no SEO impact. To achieve this, we checked that all the options conform to our specifications. But not only that: backward compatibility has the utmost priority for us, so we were determined to make sure that users upgrading from previous versions of the Yoast SEO plugin do not lose any settings either.

Every single option, and every combination of options, was tested and re-tested to make sure it works properly in the new Settings UI. That’s why the testing required a team-wide effort for a prolonged period of time.

User experience (UX)

To make sure our changes were as good as possible, we had several rounds of UX review by different teams, including UX and support teams. The support team review was especially important because they deal with Yoast SEO users on a daily basis and they know their pain points, and can predict potential confusion. But in general, almost everyone on the Yoast team dedicated some time to test the user experience.

Obviously, the UX review was performed during the initial development phase, and also as a final check during the acceptance testing.

Accessibility

At Yoast, we always try to be as inclusive as possible. That includes the need to make our updated Settings pages accessible. So we asked our accessibility specialist Andrea Fercia (who is also part of the WordPress core Accessibility team) to review our updated Settings UI, and thanks to his feedback we introduced some very important improvements in this area.

Performance & security testing

Being installed on more than 13 million sites means that we should triple-check our changes to avoid any performance and/or security issues. Affecting even 1% of our customers results in more than 130 thousand sites potentially being insecure or working with degraded performance.

That’s why we have a dedicated team that helps the development team be mindful of all the changes. They tested and reviewed the new code and provided suggestions and ideas that were promptly implemented.

Multi-language support testing

The huge variety of sites in different parts of the world that use our product requires us to make sure that every big change is still working properly in languages other than WordPress’ default English. With the help of our dedicated linguistics team, we tested the plugin in a lot of different languages to make sure that all the strings are displayed properly in places where they should be, that search is working across different languages, etc.

Automated tests

Here at Yoast, we believe that automating repetitive tasks is the correct way to simplify our lives and free up additional time for creative work. That’s why we have an extensive list of automated tests that we use to check on a daily basis that major parts of our plugin user interface are working as expected. You can read more about our processes and reasoning in the How to automate repetitive daily tests post by my colleague Daria.

While working on the new Settings UI, we adjusted our test cases to be compatible with the latest changes. This way we had an old set of tests running for Yoast SEO 19.x versions of the plugin, and a new set created specifically for the redesign and planned to be released in 20.0 – all running and used at the same time. Better safe than sorry.

Post-release support and monitoring

Despite all of our efforts and energy put into developing and testing the new Settings UI, there is always the possibility of a compatibility issue or other issues due to the number of sites using our plugin. There are literally millions of variations of plugins, themes, custom code snippets, etc. all potentially affecting the way the Yoast SEO plugin works in each individual case.

That’s why right after any release we closely monitor any issues being mentioned across various forums and social networks to catch any potential problems that our users face. We are prepared to promptly react and support anyone having complications. We hope you enjoy the new Yoast SEO settings interface and if you do encounter any bugs, make sure to let our support team know!

The post Testing the new Yoast SEO settings interface appeared first on Yoast.

How we built the inclusive language analysis in Yoast SEO

Camille Cunningham — Wed, 04 Jan 2023 12:25:35 +0000

Yoast SEO now comes with a new analysis to make your content even more accessible to all: the inclusive language analysis. This analysis helps you write more inclusively, lowering the likelihood that you’ll exclude someone from your content. That means you can reach a wider audience. But how does this analysis work? And how was it developed?

Before we get started, let’s provide you with a proper definition of what inclusive language is. At its core, inclusive language helps identify and avoid terms that could exclude marginalized groups of people. Typically, these are terms that perpetuate prejudice, stigma, or erasure. More inclusive language prefers alternatives that are less likely to be experienced as harmful or exclusionary. Want to learn more about inclusive language? Then check out this blog post on what inclusive language is.

Let’s meet the people involved

Most of the work on the inclusive language feature was done by a team of three people – two Yoast linguists/developers with a passion for inclusive language, and an external scientific advisor, Maxwell Hope. Maxwell is a Ph.D. student at the University of Delaware and his research focuses on the linguistic practices of non-binary people. For example, his research topics include the use of non-binary pronouns and gender perception in speech. They also have a deep interest in inclusive language and a lot of knowledge on this topic.

It all started with lots of research

Our first step was to compile a list of non-inclusive terms that our feature could highlight, and their inclusive alternatives. Fortunately, there are many sources available online that we could consult to help us create such a list. For example, the APA Style guide provides a lot of advice on how to write more inclusively. There are also many guides created by activists and community members. For example, the disability activist and writer Lydia X. Z. Brown created this incredibly helpful guide to avoiding ableist language.

However, compiling the list was not as simple as copy-pasting the terms we found in the guides. We took a critical approach and made our own changes, additions, and deletions. One of our guiding principles was that we always wanted to center the voices of people who belong to communities directly affected by the specific language. We used our own experience of belonging to certain communities, and/or did follow-up research, to ensure this.

For example, many inclusive language guides advise using person-first language (such as a person with a disability) instead of identity-first language (such as a disabled person) when talking about disability. However, there is a huge number of disabled people who actually prefer identity-first language. A lot of them are really against the advice of person-first language being better than identity-first language. For example, in an article titled Why Person-First Language Doesn’t Always Put the Person First, disability rights activist Emily Landau argues that this advice is actually rooted in the stigma against disabilities. Needless to say, we didn’t want to include identity-first terms such as disabled person on our list of non-inclusive terms.

The next step was the implementation

The challenge of context-dependence

Once we had a list of non-inclusive terms and their alternatives, we moved on to our next challenge. We needed to make sure that the inclusive language check only asks you to use an alternative if you’re actually using non-inclusive language. If we just searched for the terms from our list in your text, and always asked you to replace them, that would lead to a lot of inaccurate feedback. That’s because language is highly context-dependent. There are many terms that are okay to use in certain contexts but become non-inclusive in others.

Some examples:

The term First World is not inclusive when referring to a country or region. But if you’re talking about the First World War, the words First World change their meaning and become completely unproblematic.
The word guru is not inclusive when used as a general synonym for expert or mentor. But if you’re referring to an “actual” guru – a spiritual guide in religions such as Buddhism or Hinduism – it is perfectly appropriate.
The pronouns he, his, him, and himself are not inclusive when referring to people in general (for example, “Everyone has his own preferences”). But of course, you can (and should!) use these pronouns when talking about a specific person who uses these pronouns.

For a human, it’s often clear at a glance whether a term is inclusive or not in a given context. But it can be a lot harder for a machine. And that was one of the challenges we were facing.

Solutions to address context-dependence

In the end, we came up with two different solutions for addressing context dependence:

Some terms, such as First World, are non-inclusive unless they are followed or preceded by specific words (such as War). In those cases, adding a simple rule to our algorithm was fortunately enough. So for example, we tell our algorithm to not target First World when followed by War.
In the case of context-dependent terms for which adding such a rule is not possible, we had a different strategy. We always target those terms, but you will never get a red light when you use them. Instead, you will see an orange light and a feedback string that explains in which context the term is inclusive. For example, if you use the word guru, the feedback will say: “Be careful when using guru as it is potentially harmful. Consider using an alternative, such as mentor, doyen, coach, mastermind, virtuoso instead, unless you are referring to the culture in which this term originated.”

So, if you get this feedback and you are using the word guru to refer to the culture in which this term originated, you don’t have to do anything (remember, it is not necessary to make all your lights green!).
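The two strategies can be sketched in a few lines. This is only an illustration of the idea, not the code that runs in Yoast SEO: the rule table, its field names, and the assumption that First World yields a red light are ours. The sketch also only looks at the word that follows a term, not the one that precedes it.

```python
import re

# Illustrative rule table; the structure and field names are assumptions.
RULES = [
    {
        "term": "First World",
        "not_followed_by": ["War"],  # strategy 1: skip "First World War"
        "score": "red",              # assumption: a plain non-inclusive term
    },
    {
        "term": "guru",
        "not_followed_by": [],
        "score": "orange",           # strategy 2: always targeted, never red
    },
]


def check(text):
    """Return (matched term, light) pairs for every targeted occurrence."""
    findings = []
    for rule in RULES:
        exceptions = {word.lower() for word in rule["not_followed_by"]}
        pattern = re.compile(r"\b" + re.escape(rule["term"]) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            # Look at the word directly after the term, if there is one.
            next_word = re.match(r"\w+", text[match.end():].lstrip())
            following = next_word.group(0).lower() if next_word else ""
            if following in exceptions:
                continue  # an exception rule applies, so do not target the term
            findings.append((match.group(0), rule["score"]))
    return findings
```

For example, `check("The First World War ended in 1918.")` returns an empty list, while `check("She is a marketing guru.")` returns `[("guru", "orange")]`.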

Remaining challenges

With these strategies, we manage to target a lot of non-inclusive terms. But there are still some remaining challenges. For example, the pronouns he, his, him, and himself are very common words, and they are most often used in an inclusive way (talking about a specific person who uses these pronouns). Sadly, there is no simple rule that a machine could use to tell apart the inclusive and non-inclusive uses. And while we could always target these pronouns with an orange light, this would lead to a lot of cases when people get feedback only to find out that they can ignore it. We thought this would be an annoying experience, and so for now, we don’t target these pronouns at all. Maybe it’s something our team will find a nice solution for in the future, though!

Try out the analysis in Yoast SEO

The inclusive language analysis is available in both Yoast SEO free and Premium. If you’re a user of our plugin, make sure to activate this analysis and give it a try! Here’s a preview of what it will look like while working on your content:

You can activate it by going to Yoast SEO > General > Features and toggling the inclusive language analysis switch. If you’re not using Yoast SEO currently, but do want to give our inclusive language analysis a try, get the Yoast SEO plugin now.

Redesigning the Yoast SEO settings interface

Luc Kickken — Thu, 01 Dec 2022 11:59:05 +0000

We have redesigned the Yoast SEO settings interface. Today we’re “releasing” an early alpha of this new interface so you can see what it looks like. We’ve updated, overhauled, and comprehensively improved how our interfaces look, feel, and behave. We’ve also moved some settings and features to be more intuitive, and to better match your workflows.

A screenshot of the new Yoast SEO settings interface

We’re proud that we’ve always had the best functionality, but in recent years parts of our interface started to look dated. In that regard we might be a teeny, tiny bit late to the party. However, we now think we’re back at the front of the pack. Let me explain why we did what we did, and why we did it now…

First things first: download the alpha

If you want to play with this alpha, download it now:

Download Yoast SEO’s early settings alpha

All the usual caveats of alpha release software apply: do not use this in production. Anywhere. It is not safe.

Release date: January 24th

We’ll release this to the world on January 24th. At least, that’s the plan. At that point the old interface will no longer be available.

We focused on the utility belt, but not the armor

As Yoast SEO, we’ve always strived to set the industry standard. Assisting over 13 million users in the complex world of SEO. A world that is in constant movement and always rapidly changing. Being part of such an ecosystem (and the WordPress ecosystem on top of that), requires a high degree of flexibility and – by extension – a matching allocation of resources. For that reason, we, historically, focused our efforts on the utility belt, rather than the suit of armor, so to speak. On features that would help our users rank better, rather than the user experience. For the experience we relied on WordPress, by using their layouts, controls and navigational systems.

Staying true to the WP experience

It’s no secret that we love WordPress and its users. We’ve always wanted to stay as true to the native WordPress experience as we could. At the same time we also felt like we wanted to be Yoast. Our branding is rather outspoken and we’re proud of that. Bringing our brand to Yoast SEO (and WordPress) however always proved a bit of a struggle. To what extent can you be you, while being part of something else?

At the same time we’ve increasingly felt like we started to outgrow WordPress. Our product team was itching to take our experience to the next level. WordPress’ interface was holding us back a bit in that sense, as the admin interface outside of Gutenberg hasn’t really progressed for years. It felt like we were starting to collapse under the weight of our own utility belt.

The jump to Shopify and scaling to other platforms

The opportunity of releasing Yoast SEO for Shopify enhanced our desire – and the need – to start doing our ‘own’ thing rapidly. Fully integrating (in terms of look and feel) into another platform just didn’t seem possible, or desirable, anymore. So, besides wanting to adopt our own style more and more, there now was also a business rationale behind it: designing in different styles, maintaining multiple design systems and code bases, et cetera, just doesn’t scale.

Looking competitive again

We’re not blind to what our competitors are doing. It is inspiring to see what other SEO plugins, tools and platforms are delivering in terms of user experience. The honest story is that it made us want to look competitive again even more. We wanted to give our users the experience that we feel they deserve. And make you stick with us not just for the fact that we’ve got the best features and know the most about SEO, but because of the experience too.

New is always scary

New is always better. Except with whisky. All kidding aside: New can be scary. While we are proud of the direction in which we’ve taken Yoast SEO, we’re not blind to the fact that a new experience takes time to adapt to. That’s why we want to know what you – the user – think. What’s missing? What could be improved? Which of your use cases and workflows haven’t we considered? Let us know, and we’ll keep iterating.

Our long and winding road forward

We’ve worked long and hard on this. But we’re not there yet. We’ve focused on our settings for now. But that’s not all that Yoast SEO is. So there is still much more to come. For now, we invite you to try on our new suit of armor. We hope it fits and starts to feel comfortable in time. And that you’ll start to love it as much as we do.

Notice: increasing the minimum PHP requirement for Yoast SEO

Joost de Valk — Tue, 08 Nov 2022 14:10:18 +0000

In March 2023, we will be increasing the minimum PHP requirement for Yoast SEO, Yoast SEO Premium, and all its add-ons to PHP 7.2. If you’re not running PHP 7.2 or higher yet, you really, really should! It will make your site faster and safer. Not to mention, your site will use less energy. That’s why I started writing about and nudging people to update to PHP 7 in our plugin six years ago (see the original post).

Background information

WordPress statistics show that about 12.1% of all WordPress installations are on PHP versions lower than 7.2. However, our own statistics – gathered through our plugins – show that only 1.86% of sites of our users (who share data with us) are on a low version of PHP. At this point, it’s simply no longer fair to give 98.14% of our customers a plugin that’s less good than what it could be.

Switching the minimum requirement to PHP 7.2 will allow us to write much better code, as the language has evolved quite a bit since PHP 5.6 was released. In addition, dropping PHP 5.6, 7.0, and 7.1 will make it a lot easier for us to get ready for PHP 9.

Help! I don’t know what to do!!

That’s fair, not everyone will know how to update their site’s PHP version. We have an article on how to update PHP that links to a lot of resources for many of the bigger hosts. This should help you get started. If your host isn’t listed, we’ll provide you with an example email.

I’m super excited to see that we’ve come to this point and I want to thank all of you who have updated your site.

Why we don’t set the og:image:alt tag

Andrea Fercia — Wed, 22 Jun 2022 13:43:33 +0000

Users often ask us why we don’t include an og:image:alt tag in the SEO and social tags that we add to pages. Alt attributes are good for accessibility, so sharing them with Facebook must be a good thing, right? Well, it turns out that it’s not that simple! We actually removed support for the og:image:alt tag in 2019, because we think that it can harm accessibility. Read on to find out why.

Challenges with alt attributes

Alternative text is important for accessibility and to help search engines find out what your content is about. Read more about the importance of the image alternative text. You may also want to learn how to optimize your alternative text for better accessibility and SEO.

Contrary to popular understanding, the alternative text should not describe the image. Rather, it should describe the image’s purpose, which varies depending on usage and context.

The World Wide Web Consortium (W3C) distinguishes a few different types of image purpose: Informative images, Functional images, Decorative images, etc. They also provide an Alt decision tree for some quick help on deciding which category a particular image fits into. As a good rule of thumb: Images that are purely decorative don’t need an alternative text. Informative and functional images do.

One image, many alternative texts

The root problem with WordPress (and most similar systems) is that it only allows you to set a single default alternative text. Images often need many different alternative texts, depending on their purpose and context. Luckily, it’s possible to micro-manage the alternative text and change it in the post content, as the default one won’t fit all the use cases. Let’s go through a few examples to better understand this limitation.

We upload the following image to the WordPress media library. The image represents some pizza with pepperoni (note: it’s the Italian version of Pizza with pepperoni):

We set the image alternative text to:

Pizza with pepperoni

The alternative text describes the image, right? We’re all set up then! Wait a moment: Would that alternative text be a good one though? In most cases, the answer is: No. We didn’t take into consideration the actual usage of the image and its context.

Decorative purpose

Let’s say we’re going to use our Pizza image in a post where the image doesn’t add any meaningful information or function to the post content. The image is purely decorative. In this case, we should change the default alt attribute in the post content and make it empty (alt="").

Informative purpose

This time we’re going to use our image in a post about how to make Pizza with pepperoni. A pizza recipe! At some point in our post, we describe in detail how to distribute the red, yellow, and green pieces of pepperoni on top of our pizza. We add our image to illustrate this step of the recipe. Would the default alternative text ‘Pizza with pepperoni’ be meaningful in this context? No, it wouldn’t. Luckily, we can change the alt attribute in the post content and make it describe what matters in this context: how the pepperoni pieces are distributed.

Functional purpose

For our last example, we’re going to use our image in a blog post about our favorite Pizza place. At some point, we use our image as the only content for a link to our favorite pizza place website. Thus, the default alternative text ‘Pizza with pepperoni’ would actually be the link text. That would be wrong for both accessibility and SEO. Instead, the alternative text should describe the functionality of the link. It should describe the link destination. Luckily, we can change it in the post content so that it names the pizza place’s website.
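To compare the three cases side by side, here is a small sketch that renders the same image with a different alt attribute for each purpose. The file name, the link URL, and the exact wording of the informative and functional texts are made up for illustration; only the principle (empty, context-specific, link destination) comes from the examples above.

```python
from html import escape


def img(src, alt):
    """Render an <img> tag; an empty alt marks the image as decorative."""
    return '<img src="{}" alt="{}">'.format(escape(src), escape(alt))


def linked_img(href, src, alt):
    """Render an image-only link; the alt text acts as the link text."""
    return '<a href="{}">{}</a>'.format(escape(href), img(src, alt))


SRC = "pizza.jpg"  # hypothetical file name

# Decorative: no alternative text at all.
decorative = img(SRC, "")

# Informative: describe what the image adds to this step of the recipe.
informative = img(SRC, "Red, yellow, and green pepperoni pieces spread evenly over the pizza")

# Functional: describe the link destination, not the picture.
functional = linked_img("https://example.com/", SRC, "Website of our favorite pizza place")
```

The image file is the same in all three cases. Only the alt attribute changes, which is exactly what a single default alternative text in the media library cannot express.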

How does the og:image:alt tag work?

The og:image:alt is an Open Graph tag meant to accompany the og:image tag. According to the specification, it should provide ‘A description of what is in the image’. However, the principles illustrated above should apply to the og:image:alt as well.

For example, in the context of a linked image in a social share, describing what is in the image is incorrect. A linked image alternative text should describe the link destination, not the visual content of the image.

Why we removed the og:image:alt tag in Yoast SEO

When an image is actually used by Facebook (or other platforms), we know that it’ll usually be used as a linked image in a social share. Therefore, the image alternative text should describe the link destination. It should use the linked post title. Setting the alternative text to the value of an og:image:alt tag that describes the image content wouldn’t be appropriate in this case and might confuse users relying on assistive devices. Should the og:image:alt tag value be set to the post title then? That doesn’t seem correct. At this point, the actual purpose of the og:image:alt seems a bit questionable, at least in the context of a linked image in a social share.

That’s the reason why in Yoast SEO 10.1 (released in April 2019) we decided to remove the ability to set the og:image:alt tag. It fails to address a problem that should be solved upstream. At Yoast, we do think that managing the og:image:alt tag is more of a job for Facebook (and other platforms that use it). They’re the only ones who can provide meaningful alternative text for their content:

They know how the og:image will be used.
They know in which context the og:image will be used.
They know the title of the original post the og:image belongs to.
They can provide a meaningful alternative text when the og:image has a functional purpose, for example when it’s a linked image.
They can auto-generate alternative text to describe the og:image content, when the image has an informative purpose.

Does it really matter?

Our decision to remove the ability to set the og:image:alt tag is based on exploring the actual usage the main social platforms make of it. At the time of writing this post (June 2022), it appears the og:image:alt is either not used or it is used in a way that’s potentially problematic for accessibility and SEO.

Facebook

Based on our testing, when sharing a post on Facebook, the alternative text of the linked image within the post preview is set to the og:title value. That’s the linked post title and it’s perfectly appropriate because it describes the link destination.

Let’s see an example (please consider that Facebook’s implementation may vary over time). This post on Facebook links to a BBC News page that does use an og:image:alt tag.

By inspecting the page source, we can see that the tag’s content value is People walk on hot coals in Lithuania.

However, the text People walk on hot coals in Lithuania isn’t used anywhere on the Facebook share. Instead, the linked image alternative text is set to the value of the og:title tag:

Hot coal walk leaves 25 injured in Switzerland

Thankfully, I’d say, because the og:image:alt refers to Lithuania while the news is about people injured in Switzerland!

Providing the og:image:alt tag for Facebook seems a bit pointless, as it’s not used.
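If you want to repeat this kind of check on your own pages, a sketch like the following pulls the Open Graph tags out of a page’s HTML. The sample markup is hypothetical: it is reconstructed only from the two values quoted above and is not a copy of the BBC page source. The last line mirrors what we observed in our testing, not a documented Facebook rule.

```python
from html.parser import HTMLParser

# Hypothetical page head, built only from the two values quoted in this post.
SAMPLE_HEAD = """
<head>
  <meta property="og:title" content="Hot coal walk leaves 25 injured in Switzerland">
  <meta property="og:image:alt" content="People walk on hot coals in Lithuania">
</head>
"""


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> tags into a dict."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.startswith("og:"):
            self.tags[prop] = attributes.get("content") or ""


def open_graph_tags(html):
    """Return all Open Graph properties found in the given HTML."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


tags = open_graph_tags(SAMPLE_HEAD)

# In our testing, Facebook labelled the linked image with og:title,
# so og:image:alt never reaches the user.
facebook_image_alt = tags["og:title"]
```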

Twitter

The Twitter implementation is a bit more complicated from a technical perspective. It seems that it does use the og:image:alt tag and adds its content to the link labeling mechanism, which is incorrect.

A Twitter card is made of two main sections. The first one contains the linked image. The second one is the card footer, which is a link that contains the post title and the post description:

Technically, the linked image is not exposed to assistive technologies such as screen readers, because:

The linked image is placed within a container element with an aria-hidden="true" attribute, which hides it from assistive technology.
The link itself is moved out from the keyboard Tab order by the means of a tabindex="-1" attribute.

The whole ‘card’ container element is clickable via JavaScript. The container element is then labeled by means of an aria-labelledby attribute that references both the first section of the card (the image container) and the second section (the footer).

The important thing to understand is that assistive technology will announce the card container using the content of the elements referenced by the aria-labelledby attribute. Specifically, they will announce, in the following order:

The alternative text of the linked image in the first section, as provided by the og:image:alt tag.
The entire textual content of the card footer.

When sharing our BBC News example post on Twitter, the actual text that labels the Twitter card would be:

People walk on hot coals in Lithuania bbc.com Hot coal walk leaves 25 injured in Switzerland The group suffered burns after walking over a bed of coals as part of a team building exercise.

That’s incorrect because it adds the image content description to the link destination description. Also, this specific example would definitely be confusing. Again, is it about Lithuania or Switzerland?
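The labeling behavior described above boils down to joining the text of the referenced elements in the order in which they are referenced. The sketch below is a deliberate simplification of how assistive technology computes a name from aria-labelledby, and the element ids are hypothetical. The texts are the ones from our BBC News example.

```python
def label_from_references(elements, referenced_ids):
    """Join the text of the referenced elements, in reference order.

    Elements without any text contribute nothing to the label.
    """
    return " ".join(elements[i] for i in referenced_ids if elements.get(i))


# Hypothetical ids; the texts are the ones quoted in this post.
card = {
    "card-image": "People walk on hot coals in Lithuania",  # from og:image:alt
    "card-footer": "bbc.com Hot coal walk leaves 25 injured in Switzerland "
                   "The group suffered burns after walking over a bed of coals "
                   "as part of a team building exercise.",
}

references = ["card-image", "card-footer"]

# With og:image:alt set, the image description is prepended to the label.
with_alt = label_from_references(card, references)

# Without it, the label consists of the footer text only.
without_alt = label_from_references({**card, "card-image": ""}, references)
```

In this simplified model, leaving out the og:image:alt value yields a label that starts with the link destination, which is the outcome we argue for.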

Our take on the og:image:alt tag

There’s some good intent in providing the og:image:alt tag, but it all depends on the image usage and context. Based on our testing, it appears the og:image:alt can be easily misused and lead to non-ideal results. We do believe it’s not something that users or an SEO plugin should attempt to manage. Instead, we do believe that not setting the og:image:alt tag is the best option. With Yoast SEO, we do provide the image, the post and metadata, and it’s up to social media platforms to present those in an accessible manner.

The feature that almost made it

Joost de Valk — Tue, 08 Mar 2022 11:42:01 +0000

This post explains a feature we built, and then removed from Yoast SEO 18.3 just before release. It could have been pretty annoying if we had released it, so I’m happy we didn’t, but I also wanted to share just how hard this sort of thing is to prevent.

The feature idea

On February 15th, I saw a tweet by my friend and very well respected SEO Jon Henshaw, with an absolutely awesome idea: adding a QR code to the end of a page when it’s printed, so you could easily get back to the online version. Super simple, very clever, I loved it.

https://twitter.com/coywolf/status/1493630040256483338

I loved it so much, I reached out to Jon and said “would you mind if I added this to Yoast SEO?”, to which Jon responded:

It’s an idea meant for sharing and copying. I’ll take it as flattery :)

You can see why I like Jon.

So I built the feature. I tried to build it in such a way that we wouldn’t load the image when the page wasn’t being printed. I didn’t succeed at that at first, but then Herre, our lead architect, helped me and we got to a very nice bit of code that would inject a bit of HTML into the DOM right before printing. That piece of HTML would load a QR code that would be generated at that point (just-in-time), so we weren’t wasting any resources when it wasn’t needed.

Security in mind

In generating that QR code, we needed to consider the security aspect: we couldn’t create a QR code generator that would allow rendering QR codes for all the web. So we added a nonce, and verified that nonce before generating. Since the nonce was time-based too, it would stop working after a while. If you passed it an invalid nonce or no nonce at all, the code would return a 400 HTTP status code.
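How the nonce was implemented isn’t shown here. As a rough illustration of a time-based nonce, the sketch below assumes an HMAC over the URL plus a time “tick”, similar in spirit to how WordPress nonces expire. It runs on Node.js rather than in the plugin’s PHP, and every name, the secret handling and the lifetime are assumptions.

```javascript
const { createHmac, timingSafeEqual } = require( "crypto" );

// Assumed lifetime; the post only says the nonce stops working "after a while".
const NONCE_LIFETIME_MS = 24 * 60 * 60 * 1000;

// The tick changes once every half lifetime.
const currentTick = ( now ) => Math.ceil( now / ( NONCE_LIFETIME_MS / 2 ) );

const sign = ( secret, url, tick ) =>
  createHmac( "sha256", secret ).update( `${ tick }|${ url }` ).digest( "hex" );

const createNonce = ( secret, url, now = Date.now() ) =>
  sign( secret, url, currentTick( now ) );

const isValidNonce = ( secret, url, nonce, now = Date.now() ) => {
  if ( typeof nonce !== "string" ) {
    return false;
  }
  const tick = currentTick( now );
  // Accept the current and the previous tick, reject anything older.
  return [ tick, tick - 1 ].some( ( candidate ) => {
    const expected = Buffer.from( sign( secret, url, candidate ) );
    const received = Buffer.from( nonce );
    return expected.length === received.length && timingSafeEqual( expected, received );
  } );
};

// Hypothetical endpoint logic: 200 when the nonce checks out, 400 otherwise.
const qrCodeStatus = ( secret, url, nonce, now = Date.now() ) =>
  ( isValidNonce( secret, url, nonce, now ) ? 200 : 400 );
```

A nonce created now keeps working for a while and then starts returning a 400, which is exactly the behavior a crawler runs into when it revisits the same URL days later.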

All of this combined made us think we’d made a fun little feature, and a quite solid one in fact. So we merged it to our development branch and it went into our Release Candidate.

Test, test, then test some more

If there’s one thing we’ve learned at Yoast over time it’s that testing is paramount. Our QA team is, I dare say, one of the best in the WordPress ecosystem. We test every Release Candidate (RC) for two weeks, in all sorts of scenarios. We also deploy this RC to yoast.com to catch errors there.

The point with testing is: you need to know what you’re looking for. And this feature was working very nicely. It worked as expected, everything was going fine. Ready for release. Until I checked our GSC error report on the morning of the 18.3 release and saw this:

Google Search Console report indicating errors on yoast.com for QR code URLs.

Why is Google crawling these? Why is it giving errors?

So, first of all, I thought: why is Google crawling this? These URLs aren’t in the normal DOM, so there are only two ways Google could have found these: either it’s crawling all URLs it finds on a page, or it’s pretending to be “printing” every page. The latter seems a bit far-fetched, so I think we can safely assume that Google is crawling every URL it finds, even if it’s inside a piece of JavaScript. I think that’s… Ugly. But it’s not something I can prevent them from doing.

The output looked something like this (minified, of course):

So Google was grabbing the URL from that img tag inside the JavaScript. That’s going quite far if you ask me.

The second question is: why is this giving errors? Because if you crawled this, those URLs would work. Except that these nonces are time-based, so if you crawled them again a couple of days later, they would stop working. And it turns out, from our bot logs, that that’s exactly what happened!

Crawling of QR code URLs on yoast.com in our bot logs.

When I saw this I was super glad we built it like this, because we’d never have seen the error if we hadn’t built it this securely. And not seeing the error would mean we probably wouldn’t have seen that this adds a ton of crawling for every site it’s added to.

So we’re adding a new layer of reports to look at before we release a feature like this, to see if it’s changing crawl behavior. Unfortunately, with all the new APIs Google is launching for Google Search Console, we still haven’t gotten an API that actually gives us all the errors for a site. That’s an API that used to exist but no longer does, and we need that back badly. Google Search Console also doesn’t email you when you suddenly have hundreds of new errors on your site. Which to me is weird because it will email you when you have one typo in your Schema somewhere.

Conclusion

Of course, when we saw all this, we took the feature out of our Release Candidate. It’s the only logical conclusion: this will add more unnecessary things for Google to crawl, which is the opposite of what we try to achieve with Yoast SEO. We’ll try to come up with a better approach to this feature, but it’s not worth causing extra crawls, which have a very real energy usage and thus environmental impact. So unless we find a reliable way to prevent that, it won’t make it.

When evaluating new features, you already needed to keep a ton of things in mind: is it fun? Will it add something meaningful? Will it be worth it to maintain it over time? Does it add potential security vectors? We can now add another question to that list: will search engines crawl the output in weird ways and will that cause issues?


Behind the front-end: Yoast SEO for Shopify

Nolle Groen — Tue, 25 Jan 2022 13:21:18 +0000

It is April 2021 and amidst a global pandemic, we are about to embark on our greatest venture yet: developing an SEO app for Shopify. For a development department so focused on WordPress for so long, this is both a dream and a nightmare. Starting fresh on a new platform offers endless possibilities, but even more challenges along the way. Now, after months of intense development, Yoast SEO for Shopify has seen the light of day. And I feel obliged to share this technical journey with you. Not because I think it needs explaining, but because I couldn’t be more proud of the giant leap forwards we took with this project. This is a story of how we, from our attics and kitchen tables, brought ‘SEO for everyone’ to Shopify.

Well, actually… This is merely a look behind the front-end of our new Shopify plugin. My name is Nolle, I’m a front-end developer at Yoast and together with the Components squad, I’m to git blame for just that. I want to start by taking you through the main technologies driving the user interface and I want to end with why we didn’t actually build this front-end for Shopify at all. Come again?

Technologies

Our tech stack was primarily determined by what we’re familiar with and what Shopify promotes. You can experiment when you get a clean slate like this. But creating a really robust solution based on past experiences is even better. And that’s exactly what we did, though there are some firsts in the list below.

React

React obviously needs little introduction. It’s already powering most of our highly interactive interfaces in WordPress and for Shopify we took this even further. The front-end is completely written in React and consists of two single page applications. One for general SEO settings and one for optimizing content (more on this separation later). This makes for a user experience that feels much faster and is packed with cool features like optimistic UI, skeleton loaders and live form feedback.

Technically, we’ve moved away completely from class-based components. We’re now committed to a system of smart and dumb functional components. Smart (controller) components are separated from their dumb (view) counterparts so that the latter can be generalized and reused. We’ve also started to follow a ‘hooks over HOCs’ principle. A principle where we favor React hooks over higher-order components (and basically everything else) as much as possible. This all makes for a modern and future-proof React codebase.

Redux and WordPress Data

Like React, Redux is a familiar face here at Yoast. We’ve been using it to manage centralized state in our JavaScript applications for a long time. In our Shopify plugin, we’re using the WordPress Data module to work with Redux because of its familiarity and benefits. It wraps the core Redux API in a thin layer of optimizations like selector memoization and a Redux Saga-like system for writing asynchronous actions, called controls.

Next to a nice and nested state structure, we feel like this store has become one of our cleanest yet by adhering to a few simple rules. For instance, we got rid of those dreadful state Booleans (isLoading, isSuccessful, etc.) and replaced them with status constants. That way, the state of something asynchronous, for instance, can be traced back to one place instead of being derived from multiple Booleans. No more weird edge cases in-between states!

// Replace those bug-prone state Booleans
// with status constants.

// Using status booleans in state.
const state = {
  isLoading: true,
  hasError: false,
};

// Using status enums in state.
const state = {
  status: "loading", // || "idle" || "success" || "error"
};
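To show how a single status value plays out in practice, here is a small sketch of a reducer that moves between those statuses. The action names, the STATUS object and the selector are made up for illustration and are not taken from the actual store.

```javascript
// Hypothetical status constants, mirroring the values in the snippet above.
const STATUS = Object.freeze( {
  idle: "idle",
  loading: "loading",
  success: "success",
  error: "error",
} );

const initialState = { status: STATUS.idle, data: null, error: null };

// Every transition sets exactly one status, so "loading and failed at once" cannot happen.
const settingsReducer = ( state = initialState, action ) => {
  switch ( action.type ) {
    case "settings/requested":
      return { ...state, status: STATUS.loading, error: null };
    case "settings/received":
      return { ...state, status: STATUS.success, data: action.payload };
    case "settings/failed":
      return { ...state, status: STATUS.error, error: action.payload };
    default:
      return state;
  }
};

// Derived Booleans can still exist, but only as selectors on top of the single status.
const selectIsLoading = ( state ) => state.status === STATUS.loading;
```

Components that prefer a Boolean can use a selector like selectIsLoading, while the store itself only ever holds one source of truth.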

We’ve also tried to be strict in writing actions in an event-like manner. Meaning we don’t ‘set’ anything in the store, we only ‘inform’ it that something (an event) has occurred. For instance, the difference between setActiveItem and itemActivated is subtle. But the latter is not coupled to a specific reducer, whereas the former implies one. In a good Redux store, every reducer is able to respond to every action if needed. Again, following simple rules like these has made all the difference for us in creating a robust store.

// Event-like Redux actions promote looser coupling
// between reducers and the actions they respond to.

// Using setter like actions.
const authenticationReducer = ( state, action ) => {
  switch ( action.type ) {
    case "setIsAuthenticated": return {
      ...state,
      isAuthenticated: action.payload.isAuthenticated,
    };

    case "setUsername": return {
      ...state,
      username: action.payload.username,
    };

    default: return state;
  }
};

const notificationReducer = ( state, action ) => {
  switch ( action.type ) {
    case "setAuthenticationNotification": return [
      ...state,
      {
        type: "success",
        message: "You were successfully authenticated!",
      },
    ];

    default: return state;
  }
};

// Using event like actions.
const authenticationReducer = ( state, action ) => {
  switch ( action.type ) {
    case "user/authenticated": return {
      ...state,
      status: "authenticated",
      username: action.payload.username,
    };

    default: return state;
  }
};

// Another reducer reacts to the same action.
const notificationReducer = ( state, action ) => {
  switch ( action.type ) {
    case "user/authenticated": return [
      ...state,
      {
        type: "success",
        message: "You were successfully authenticated!",
      },
    ];

    default: return state;
  }
};
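To see the “every reducer can respond to every action” idea at work without pulling in Redux itself, the sketch below wires two event-style reducers together with a tiny stand-in for Redux’s combineReducers. The stand-in, the default states and the example username are assumptions for illustration only.

```javascript
const authenticationReducer = ( state = { status: "anonymous", username: null }, action ) => {
  switch ( action.type ) {
    case "user/authenticated":
      return { ...state, status: "authenticated", username: action.payload.username };
    default:
      return state;
  }
};

const notificationReducer = ( state = [], action ) => {
  switch ( action.type ) {
    case "user/authenticated":
      return [ ...state, { type: "success", message: "You were successfully authenticated!" } ];
    default:
      return state;
  }
};

// Minimal stand-in for Redux's combineReducers: every reducer sees every action.
const combineReducers = ( reducers ) => ( state = {}, action ) =>
  Object.fromEntries(
    Object.entries( reducers ).map( ( [ key, reducer ] ) => [ key, reducer( state[ key ], action ) ] )
  );

const rootReducer = combineReducers( {
  authentication: authenticationReducer,
  notifications: notificationReducer,
} );

// One event, two independent reactions.
const nextState = rootReducer( undefined, {
  type: "user/authenticated",
  payload: { username: "example-user" },
} );
```

A single dispatched event updates both slices, and neither reducer needs to know the other exists.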

Tailwind

With great UX comes great CSS responsibility. At Yoast, like most other companies, we witnessed first-hand that writing a scalable and maintainable CSS codebase is complicated. We’ve tried everything from plain CSS, preprocessors and CSS-in-JS options to all kinds of frameworks. But we never got a firm grasp on dealing with consistency, exceptions and learning curves. Finally, with Tailwind, we feel we’ve found a durable solution: stop writing CSS altogether.

Tailwind is a utility-first CSS framework and the concept is simple: combine single-purpose CSS classes on your elements to get the styling you need. Instead of combining CSS properties in your stylesheets to get there. Can you see how this immediately solves the scalability issue? Writing CSS becomes a matter of finding the right combinations of utilities instead of, well, writing actual CSS. Of course, Tailwind provides a solution for combining utilities into new class names for easy application, but you’re still not really writing CSS. Tailwind comes with great documentation including many examples, is easy to configure and provides great performance by purging unused styles. I recommend it to everyone.

// Combine multiple Tailwind utility classes into a custom
// .yst-section class using the @apply directive.

.yst-section {
  @apply yst-max-w-5xl lg:yst-grid lg:yst-grid-cols-3 lg:yst-gap-8 yst-mx-8 yst-mb-8 yst-border-b yst-border-gray-200 yst-pb-8;
}

Platform agnosticism

Now let’s get back to that thing I said about this front-end not being built for Shopify at all. With this project, we set out to create a universal Yoast interface, not specifically tailored to one platform. So instead of building a user interface, we were actually building an interface for building user interfaces. A higher-order user interface, if you will. This kind of agnostic approach offers challenges of its own, mainly those concerning flexibility.

Configurability

Not all platforms offer the same editable content types. And some platforms might not even benefit from or support certain SEO features at all. In other words, the front-end has to be highly configurable to fit the needs and capabilities of many platforms. Almost like an SEO settings form builder.

We solved this technically by exposing an app initializer function that accepts the configuration for the specific platform at hand. This object mainly consists of a list of content types and their supported SEO features and a list of generally supported SEO features. Based on this configuration, the initializing function creates a Redux store and other contexts, sets up routes and form components and registers callbacks for retrieving and saving data. It then returns an app object with a render method, which is basically just a ReactDOM.render call with its first argument, the component, already supplied. Now the implementor can render the app in a DOM element of their choosing while remaining unaware of the technology responsible for that rendering.

This approach has proven to offer the flexibility and decoupling we were looking for. Everything the user interface needs to know from the implementor is shared through a single configuration object. In theory, we could now switch out Redux for React’s useReducer. Or replace React with Vue entirely without having to touch the implementor whatsoever. Magic…

// A basic example of how to initialize
// and render the new Yoast user interface.
import initializeSettings from "@yoast/admin-ui/settings";

const { render } = initializeSettings( {
  isSetting1Enabled: true,
  isSetting2Enabled: false,
  contentTypes: [
    {
      name: "post",
      isContentTypeSetting1Enabled: false,
      ...etc,
    },
  ],
  handleSave: async ( data ) => {
    const response = await saveData( data );
    return response.status;
  },
  ...etc,
} );

// Start rendering React in the #root.
render( document.getElementById( "root" ) );
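For a feel of what such an initializer could look like on the inside, here is a deliberately simplified sketch. It is not the actual @yoast/admin-ui code: createInitializer, the injected renderer and the shape of the app object are assumptions, chosen to show how the implementor only ever touches the configuration and the render method.

```javascript
// Hypothetical sketch: the renderer is injected, so the UI library stays an internal detail.
const createInitializer = ( renderer ) => ( config ) => {
  // Derive everything the app needs from the single configuration object.
  const app = {
    contentTypes: config.contentTypes.map( ( contentType ) => contentType.name ),
    handleSave: config.handleSave,
  };

  return {
    // Equivalent in spirit to a render call with its first argument already supplied.
    render: ( element ) => renderer( app, element ),
  };
};

// A stand-in renderer; a real one would hand a component tree to React.
const initializeSettings = createInitializer( ( app, element ) => {
  element.textContent = `Settings for: ${ app.contentTypes.join( ", " ) }`;
} );
```

Swapping the stand-in renderer for another one changes nothing for the code that calls initializeSettings and render, which is the decoupling described above.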

Specifics to Shopify

Of course, each platform is going to have its own little caveats and Shopify was no exception. There is a thin layer of JavaScript that sits between the UI and the API to tackle these and make sure the interface can remain agnostic. Since this is a post about our Shopify plugin, I want to quickly highlight the main responsibilities of this in-between adapter layer specific to Shopify.

Extending the Shopify editor

Firstly, Shopify does not offer a way to properly extend its content editor with a Yoast sidebar as WordPress does. Therefore, the Shopify plugin features a separate React app for optimizing your content’s SEO using our own little editor. This editor offers a clean and simple experience and supports almost all content relevant to SEO. By hooking into Shopify’s plugin navigation API, users can switch between their general SEO settings and the Optimize app while staying comfortably within the Yoast plugin.

File uploading

File uploads proved to be a particularly challenging feature. It might sound simple, but each platform has its own way of dealing with uploads. While in WordPress we could rely on the media library, in Shopify we had to write the upload logic ourselves. Therefore, the agnostic front-end only shows a dumb file uploader that leaves its behavior on user interaction to a callback passed through the configuration object. In Shopify, this callback renders a hidden file input and immediately triggers a click on that element causing the browser to show its native file browser. On other platforms that logic might be completely different.
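The hidden-input trick described above can be sketched roughly as follows. This is not the actual adapter code: createUploadHandler and onFileSelected are hypothetical names, and the document is passed in so the logic can be tried outside a browser.

```javascript
// Sketch of a platform-specific upload callback for a "dumb" file uploader.
const createUploadHandler = ( doc, onFileSelected ) => () => {
  const input = doc.createElement( "input" );
  input.type = "file";
  input.hidden = true;

  input.addEventListener( "change", () => {
    const file = input.files[ 0 ];
    if ( file ) {
      // What happens with the file (e.g. uploading it) is up to the platform adapter.
      onFileSelected( file );
    }
    input.remove();
  } );

  doc.body.appendChild( input );
  // Triggering a click makes the browser show its native file browser.
  input.click();
};
```

The agnostic front-end would only receive the returned function through the configuration object and call it when the user interacts with the uploader. Note that this sketch leaves the hidden input in place when the user cancels the dialog.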

Blogs vs. blog posts

A quirk in Shopify we had to overcome is that the blog post content type is coupled tightly with the blog type. In WordPress terms, the blog type is a separate post type but also a mandatory taxonomy of the blog post type. The pain here being you are unable to retrieve blog posts using the Shopify API without specifying a blog identifier as well. We didn’t want this structure to break our pattern of optimizing per content type. So we decided to show each term of the blog ‘taxonomy’ (e.g. blog #1) in the main navigation. This way, you won’t be optimizing blog posts, but you will be optimizing blog posts in blog #1. Luckily our setup proved flexible enough to handle this and the user experience is much better for it.

Data transformation

Lastly, by designing the front-end’s data structure without a specific platform in mind, we were bound to have some data mismatches between API and UI. A nicely structured transformation layer to the rescue! Each callback passed through the configuration object (e.g. handleSave) is responsible for transforming the data it receives into the structure the receiving side expects.

const transformDataForApi = ( data ) => ( {
  seoTitle: data.seo.title,
  metaDescription: data.seo.description,
  ...etc,
} );

const handleSave = async ( data ) => {
  const dataForApi = transformDataForApi( data );
  const response = await saveData( dataForApi );
  return response.status;
};

Conclusion

I hope I gave you some insight into our fun and challenging journey of building the front-end for Yoast SEO for Shopify. As I said, I’m very proud of the level of abstraction we obtained in this project, paving a clear path for the future. Right now, we’re very much looking forward to receiving user feedback, so we can keep on improving their experience.

If you want to read more about why we’ve built this app and the whole process around it, our CEO Thijs de Valk wrote an interesting piece on how we scoped, built and launched our Shopify app. Thanks for reading and if you have any front-end related questions, I’d be happy to answer them in the comments. Have a great day!


Site search with Algolia – improved

Joost de Valk — Tue, 13 Jul 2021 09:36:35 +0000

Here at Yoast, we’ve been using Algolia to power our site search for quite a while now. Specifically, we’re using the WP Search with Algolia plugin by WebDevStudios to index our content. And we were reasonably happy with that setup. But then we found a better way.

A few weeks ago I came up with a relatively simple improvement that actually dramatically improved the quality of our internal search results: using the number of internal links pointing to a page as a custom ranking attribute.

Yoast SEO’s link metrics

Yoast SEO maintains a table with link metrics within your site. For every post or page, we know how many other posts or pages link to that page from within the content. On a site that has managed its internal linking well, this means that your search is very easily improved with a very reliable ranking metric. If you’ve been using our cornerstone content functionality and our internal linking suggestions, your best pages should suddenly rank easily for their important terms.

In fact: if you use our internal link metrics like this, your site search becomes a great way to see whether you’ve done your internal linking right.
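To make the idea a bit more concrete: Algolia’s customRanking index setting takes strings such as desc(attribute), and custom ranking acts as a tie-breaker after textual relevance. The sketch below imitates that tie-breaking in plain JavaScript. The attribute name incoming_link_count and the textualScore field are hypothetical, and the real attribute name comes from the integration itself.

```javascript
// Hypothetical attribute name; check the actual integration for the real one.
const LINK_COUNT_ATTRIBUTE = "incoming_link_count";

// Shape of the Algolia index setting: custom ranking on the link count, highest first.
const indexSettings = {
  customRanking: [ `desc(${ LINK_COUNT_ATTRIBUTE })` ],
};

// Simplified stand-in for the ranking: textual relevance first,
// then the number of internal links as a tie-breaker.
const rankHits = ( hits ) => [ ...hits ].sort( ( a, b ) =>
  ( b.textualScore - a.textualScore ) ||
  ( b[ LINK_COUNT_ATTRIBUTE ] - a[ LINK_COUNT_ATTRIBUTE ] )
);

const rankedHits = rankHits( [
  { title: "An old announcement", textualScore: 2, incoming_link_count: 3 },
  { title: "The cornerstone guide", textualScore: 2, incoming_link_count: 120 },
  { title: "A barely related post", textualScore: 1, incoming_link_count: 500 },
] );
```

Of the two equally relevant hits, the well-linked cornerstone guide ends up on top, while a heavily linked but less relevant post still ranks below both.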

The meta description & social images

On top of this new feature, we also send the meta description of the post and its social image along. While it takes a slightly more complex Algolia setup, you might be able to use this in your search result listings, as we do here on yoast.com. The meta description field is simply called yoast_seo_metadesc and the social image can be found in the images array, under the social key.

Get this implementation with Yoast SEO Premium 16.7

It’s so easy to improve your site search using this method that it’s almost a no-brainer. If you want to learn how to set up Algolia with our link count as a ranking factor, you’ll need a couple of things first:

WP Search with Algolia plugin (free and paid plans available)
Yoast SEO Premium version 16.7 or higher
Well-maintained internal links (use our SEO workout to check and set these more quickly).

Got those ready? Then you just need to follow a couple of steps to re-index and add the new custom ranking attribute we’ve made. And that’s it!


The 2020 WordPress and PHP 8 compatibility report

Omar Reiss — Wed, 28 Oct 2020 11:13:40 +0000

On November 26, PHP 8 will be released to the world. PHP 8 is set to become one of the most breaking PHP releases in the history of PHP and will bring along unprecedented challenges for legacy PHP codebases, like WordPress, to fix compatibility.

Today we bring you a comprehensive report on WordPress and PHP 8 compatibility. In sharing this, we hope to educate and help inform both the WordPress and PHP communities about the state of WordPress and PHP 8. After all, PHP is the technology that powers WordPress and WordPress is by far the largest consumer of PHP.

This report is a team effort by Juliette (a PHP engineer well-respected in both the PHP and WordPress communities), Herre (Chief software architect at Yoast) and myself.

Introduction
Part 1: The most worrisome breaking changes in PHP 8
Part 2: Compatibility challenges
Part 3: yoast.com case study
- What did we do?
- Analysis
Conclusion

Introduction

What’s in this report?

In the first part of this report we’ll outline the changes in PHP 8 that are likely to significantly impact WordPress and other legacy codebases. In the second part of this report, we’ll try to provide a perspective on past, present and future challenges regarding WordPress and PHP compatibility. At the end of the report we’ve included a case study of Yoast.com, to illustrate what kind of issues are likely to occur for a large WordPress site on PHP 8.

How come there are so many breaking changes in PHP 8?

PHP 8 is a major update of PHP and it is common practice for a major version to remove features that were deprecated in the previous range of minor versions. For PHP 8, many of the breaking changes have been deprecated in previous 7.* versions. So for projects that were diligently updated over the years, fixing their use of deprecated APIs, it shouldn’t be hard to upgrade at all.

However, PHP 7.* versions have seen a far larger set of deprecations than previous versions of PHP. Where PHP 5.6 to PHP 7 was a relatively simple migration, going from 7.x to 8 could be very painful, especially for very old codebases, like WordPress and many of the plugins that are available for it. For well-typed codebases or codebases which have stayed up-to-date with the latest PHP versions, there isn’t a big problem. The reality, however, is that WordPress is not such a codebase.

Isn’t WordPress already compatible with PHP 8?

Well… Yes. Sort of. Maybe. We are highly doubtful. It’s really not possible to tell.

WordPress aims to always be compatible with new versions of PHP. Sergey did an amazing job in fixing most of the compatibility issues that could be detected using the available strategies. We’ll definitely dive deeper into what those are and the issues that exist with them. Technically, the compatibility of the current nightly of WordPress with PHP 8 is at the same level as we’re used to from WordPress releases right before a new version of PHP comes out. We believe the testing was as extensive, the fixing was as diligent and the level of fixes was as high as any round of PHP compatibility fixing within WordPress core. See also the call for testing on PHP 8 on WordPress.org.

However, doing what we’ve always done, unfortunately, will not cut it this time. The sheer amount of breaking changes and the type of changes included in PHP 8, plus some added complexities in cross-version tooling, make this compatibility challenge a different beast from what we’ve seen before. This report aims to explain how that is the case.

Part 1: The most worrisome breaking changes in PHP 8

Strict typing on internals in PHP 8

One of the most important breaking changes in PHP 8 has to do with strict typing. User-defined functions in PHP already throw a TypeError when they are passed an argument of the wrong type. Internal functions, however, emitted a warning and returned null. PHP 8 makes this consistent: internal functions now also throw a TypeError. This will not only impact functions that already threw warnings prior to PHP 8, but also magic methods (which previously weren’t type checked) and functions that have had type declarations introduced. For this reason, it’s not possible to catch all issues that arise from this change by fixing the type warnings in PHP 7.4 environments. Below is an overview of related breaking changes that together define the scope of strict typing related changes in PHP 8.

Consistent type errors

As of PHP 8 internal functions now throw a TypeError for all typed arguments.

Arithmetic operator type checks

The arithmetic and bitwise operators +, -, *, /, **, %, <<, >>, &, |, ^, ~, ++, -- will now consistently throw a TypeError when one of the operands is an array, resource or non-overloaded object. The only exception to this is the array + array union operation, which remains supported.

Before PHP 8, it was possible to apply arithmetic or bitwise operators on arrays, resources or objects. This isn’t possible anymore, and will throw a TypeError.

Magic methods type checks

Magic Methods will now have their arguments and return types checked if they have them declared. The signatures should match the following list:

__call(string $name, array $arguments): mixed
__callStatic(string $name, array $arguments): mixed
__clone(): void
__debugInfo(): ?array
__get(string $name): mixed
__invoke(mixed $arguments): mixed
__isset(string $name): bool
__serialize(): array
__set(string $name, mixed $value): void
__set_state(array $properties): object
__sleep(): array
__unserialize(array $data): void
__unset(string $name): void
__wakeup(): void

Numeric string handling

Numeric string handling changed to be more intuitive and less error-prone. Trailing whitespace is now allowed in numeric strings for consistency with how leading whitespace is treated. This mostly affects:

The is_numeric() function
String-to-string comparisons
Type declarations
Increment and decrement operations

The concept of a “leading-numeric string” has been mostly dropped; the cases where this remains exist in order to ease migration. Strings which emitted an E_NOTICE “A non well-formed numeric value encountered” will now emit an E_WARNING “A non-numeric value encountered” and all strings which emitted an E_WARNING “A non-numeric value encountered” will now throw a TypeError. This mostly affects:

Arithmetic operations
Bitwise operations

This E_WARNING to TypeError change also affects the E_WARNING “Illegal string offset 'string'” for illegal string offsets. There are no changes in the behavior of explicit casts to int/float from strings.

Named parameters

Support has also been added for named parameters. This has two major implications:

Renaming parameters becomes a breaking change. If a parameter is renamed then anywhere that function is called with named parameters will break.
The behaviour of call_user_func_array() changes. Previously, call_user_func_array() could be called with an associative array and the keys were ignored. Now passing an associative array will be interpreted as using named parameters, which will cause an Error exception to be thrown if any of the named parameters do not exist.

API changes which could lead to type errors

Below we’ve compiled a list with some examples of API changes that will lead to type or argument errors where there were no indications as such in previous PHP versions.

mktime() and gmmktime() now require at least one argument. time() can be used to get the current timestamp.
spl_autoload_register() will now always throw a TypeError on invalid arguments, therefore the second argument $do_throw is ignored and a notice will be emitted if it is set to false.
assert() will no longer evaluate string arguments, instead they will be treated like any other argument. assert($a == $b) should be used instead of assert('$a == $b'). The assert.quiet_eval ini directive and the ASSERT_QUIET_EVAL constant have also been removed, as they will no longer have any effect.
The $args argument of vsprintf(), vfprintf(), and vprintf() must now be an array. Previously any type was accepted.
Arguments with a default value that resolves to null at runtime will no longer implicitly mark the argument type as nullable. Either use an explicit nullable type, or an explicit null default value instead.

Warnings converted to error exceptions

There are a large number of PHP warnings that have been changed to error exceptions in PHP 8.

Error level changes unrelated to RFC’s

The following warnings have been converted to errors, probably in relation to deprecations in PHP 7.x versions:

Attempting to write to a property of a non-object. Previously this implicitly created an stdClass object for null, false and empty strings.
Attempting to append an element to an array for which the PHP_INT_MAX key is already used.
Attempting to use an invalid type (array or object) as an array key or string offset.
Attempting to write to an array index of a scalar value.
Attempting to unpack a non-array/Traversable.

Reclassified engine warnings

Many engine messages that previously only triggered notices or warnings have been reclassified to a higher severity. The following were changed:

Undefined variable: warning instead of notice
Undefined array index: warning instead of notice
Division by zero: DivisionByZeroError exception instead of warning
Attempt to increment/decrement property ‘%s’ of non-object: Error exception instead of warning
Attempt to modify property ‘%s’ of non-object: Error exception instead of warning
Attempt to assign property ‘%s’ of non-object: Error exception instead of warning
Creating default object from empty value: Error exception instead of warning
Trying to get property ‘%s’ of non-object: warning instead of notice
Undefined property: %s::$%s: warning instead of notice
Cannot add element to the array as the next element is already occupied: Error exception instead of warning
Cannot unset offset in a non-array variable: Error exception instead of warning
Cannot use a scalar value as an array: Error exception instead of warning
Only arrays and Traversables can be unpacked: TypeError exception instead of warning
Invalid argument supplied for foreach(): TypeError exception instead of warning
Illegal offset type: TypeError exception instead of warning
Illegal offset type in isset or empty: TypeError exception instead of warning
Illegal offset type in unset: TypeError exception instead of warning
Array to string conversion: warning instead of notice
Resource ID#%d used as offset, casting to integer (%d): warning instead of notice
String offset cast occurred: warning instead of notice
Uninitialized string offset: %d: warning instead of notice
Cannot assign an empty string to a string offset: Error exception instead of warning
Supplied resource is not a valid stream resource: TypeError exception instead of warning

The @ operator no longer silences fatal errors

It’s possible that this change will reveal errors that were hidden before PHP 8. Make sure to set display_errors=Off on your production servers!

Fatal error for incompatible method signatures

Inheritance errors due to incompatible method signatures between two classes will now throw a fatal error instead of a warning.

Default error reporting level

With PHP 8 the default error reporting level is changed to E_ALL instead of everything but E_NOTICE and E_DEPRECATED. This means that many errors will start showing up which were previously silently ignored.

7.x deprecations

During the PHP 7.x release cycle, each version introduced new deprecations, which have now been finalized as feature removals in PHP 8. This also applies to some deprecations that were already put in place in PHP 5.x, but weren’t removed in the PHP 7.0 release.

Most notably, the following already deprecated features have been removed in PHP 8:

$php_errormsg
create_function()
mbstring.func_overload
parse_str() without second argument
each()
assert() with string argument
$errcontext argument of error handler
String search functions with integer needle
Defining a free-standing assert() function
The real type (alias for float)
Magic quotes legacy
array_key_exists() with objects
Reflection export() methods
implode() parameter order mix
Unbinding $this from non-static closures
restore_include_path() function
allow_url_include ini directive

Other breaking changes

There are many other (breaking) changes in PHP 8, including significant changes to the return types of select, but often-used, functions, resources being turned into Value Objects and more. Examples include GD, OpenSSL, Sockets, XML, Zlib and substr(), among others.

Above we’ve tried to highlight the ones that are likely to directly impact WordPress (and many other legacy systems) in a significant way. We’ve based this overview on the excellent guide “What’s new in PHP 8” on stitcher.io and the PHP 8 upgrade guide. For more information we refer you to those sources.

Part 2: Compatibility challenges

To make an existing codebase compatible with a new version of PHP, there are a couple of different strategies that can be deployed for discovery:

Static analysis tools like PHPCompatibility to detect syntactic issues.
Automated testing to detect runtime issues.
Manual testing to detect runtime issues.

Depending on the coverage of your test suite and the proportion of runtime and syntactic changes, these strategies serve as a good basis for fixing the compatibility of a codebase with a new version of PHP. However, in the case of WordPress and PHP 8, there are quite a few extra challenges which make it hard to rely on these strategies for fixing compatibility and declaring WordPress compatible with PHP 8. Below we’ll report on which strategies have been deployed for WordPress and what the results were.

Static analysis tools

Due to the nature of some of the changes in PHP 8.0, the issues which can be found using static analysis are limited. And in those cases where static analysis tries to go beyond its traditional capabilities and tries to trace the runtime type and value of variables and constants, the results of such scans will be very prone to false positives.

Aside from that, PHPCompatibility is the only static analysis tool dedicated to finding issues related to PHP cross-version compatibility.

Other static analysis tools will report on a far larger scope of issues. Wading through the results to find the issues which are related to PHP cross-version compatibility and actually correct, is very time-consuming and requires in-depth knowledge of the tooling to configure them for the least amount of noise.

At the same time these tools are in constant flux, trying to keep up with the changes in PHP and updating the available scans, so independently of what has been and can be found at this time, chances are that these tools will find still more issues in the (near) future.

Scanning WordPress with PHPCompatibility

Issues reported early on in the WP 5.6 dev cycle and fixed since based on PHPCompatibility scans:

Another PHP 8 issue detected by PHPCompatibility is “__destruct() will no longer be called after die() in __construct()”. This is correctly detected by the scanner, but upon further analysis has been determined not to be problematic in this particular case.

PHPCompatibility has also detected an issue in code used by the Plugin/Theme editor. Analysis of the involved code has determined there is an underlying oversight in the code. WordPress tries to do minimal analysis of the code in the editor, but doesn’t take PHP 5.3+ (namespaced) code into account. This oversight will now just be made more complex to solve while taking related changes in PHP 8.0 into account.

Scans with PHPCompatibility have been run with the develop version. No version has been released yet containing the PHP 8 specific scans.

Issues detected by the scanner in externally maintained dependencies have been reported there.

Scanning WordPress with Exakat

As of the latest public scan, based on WP trunk of October 16th, Exakat reports a total of 149,567 issues.

The PHP 8 compatibility report contains a total of 93 issues, but is incomplete as a number of analyses relevant for PHP 8 are not (yet) included in the report.

The “worst” offender based on the PHP 8.0 compatibility report, is parameters in method declarations in child classes being named differently from the parameter in the parent class. This is incompatible with the new PHP 8.0 “named parameters in function calls” feature.

This has been reported in issue Trac 51553 and a patch for this has been attached to the ticket, but has not yet been committed.

Other tasks which ought to be executed to prepare for named parameters in function calls have been listed in this ticket as well, including an action list for one of these tasks. No action has been taken on any of these so far.
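To illustrate why differently named parameters in child classes clash with named parameters, here is a minimal sketch. The class and function names are hypothetical and not taken from WordPress Core, and the call syntax requires PHP 8.0 or later.

```php
<?php
// Hypothetical classes for illustration only; not WordPress Core code.
class Base_Walker {
    public function start( $output, $depth = 0 ) {
        return $output . str_repeat( '-', $depth );
    }
}

class Child_Walker extends Base_Walker {
    // Same signature shape, but the second parameter has been renamed.
    public function start( $output, $level = 0 ) {
        return $output . str_repeat( '+', $level );
    }
}

function render( Base_Walker $walker ) {
    // Written against the parameter names of the parent class.
    return $walker->start( output: 'item', depth: 2 );
}

echo render( new Base_Walker() ), PHP_EOL; // item--

try {
    // The child no longer has a parameter called $depth.
    echo render( new Child_Walker() ), PHP_EOL;
} catch ( Error $e ) {
    echo get_class( $e ), ': ', $e->getMessage(), PHP_EOL;
}
```

The renamed parameter is legal PHP, so nothing fails until some caller uses named arguments against the parent's names. That makes this a runtime issue which only surfaces on the affected code path.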

The 12 “Unsupported Types with Operators” warnings are mostly false positives and this has been reported to Exakat.

More worrisome are the 14,679 issues reported for using the wrong argument type, 14,135 issues for wrong type with call, 15,605 issues for wrong number of arguments, 801 for wrong type for native PHP function and 25 wrong parameter type issues.

While it is expected that these reports will contain a large number of false positives as WordPress doesn’t use type declarations and the types are therefore extrapolated from the code found and the types indicated in docblocks, these issues should still be examined individually. Even if just 1% of the reported issues are real, that would still come down to ~450 errors which need to be dealt with, quite apart from the huge amount of time needed to weed out the real issues from the false positives. At the time of writing, the author is not aware of any efforts being made to examine these reports and identify and fix these issues.
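The ~450 estimate can be reproduced from the type-related counts quoted above. This is only a back-of-the-envelope sketch: the array keys are shorthand labels, and the 1% rate is the hypothetical from the text, not a measured value.

```php
<?php
// Type-related issue counts as quoted from the Exakat report above.
$reported = array(
    'wrong argument type'                => 14679,
    'wrong type with call'               => 14135,
    'wrong number of arguments'          => 15605,
    'wrong type for native PHP function' => 801,
    'wrong parameter type'               => 25,
);

$total = array_sum( $reported );

// Assumption from the text: only 1% of the reports are real issues.
$real_estimate = (int) round( $total * 0.01 );

printf( "Reported: %d, estimated real issues at 1%%: %d\n", $total, $real_estimate );
// Reported: 45245, estimated real issues at 1%: 452
```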

A few of the analysis reports by Exakat for other issues related to PHP 8, not contained in the PHP 8.0 compatibility report, have been examined. Patches for these have been submitted and committed over the last week. This includes:

A fix for a PHP 8.0 fatal error in the WP Revisions module.
A fix for a PHP 8.0 warning in the pomo library.

Scanning WordPress with PHPStan

Scans with PHPStan need a highly customized ruleset to get remotely usable results and even then, they are riddled with false positives, making the output nigh unusable.

Note: this is not necessarily criticism of the PHPStan tooling, but is largely due to the fact that WordPress barely uses type declarations, while PHPStan is primarily geared towards projects using modern code.

An initial scan with the most basic of configurations yields well over 20,000 issues.

A scan with the above-mentioned, highly customized ruleset, aimed specifically at PHP 8 related issues, still yields 580 issues at level 5, another 2,150 potential issues at level 7 (though these will likely contain a lot of false positives) and yet 380 more issues at level 8, with the same caveat.

A Trac ticket was opened a while back to address a list of issues based on an unknown configuration, but specifically aimed at passed parameter type mismatches (level 5). There is a draft PR available to fix these issues.

An initial assessment of this PR, however, shows that the majority of fixes proposed would hide issues by typecasting variables to the expected type, not actually fix them by doing proper type checking. This can lead to unexpected behaviour in the application if these changes are not accompanied by strict unit tests (and they are not). This will also likely result in much harder to debug errors further down the line.

At this time, it has not been verified whether the fixes proposed are even warranted or that the issues identified should be considered false positives.
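The difference between hiding a type issue with a cast and handling it with a type check can be sketched as follows. The function names and the fallback value are hypothetical and only serve as illustration; they are not taken from the PR in question.

```php
<?php
// Hypothetical helpers for illustration only.

// "Fix" by typecasting: any input is silently coerced, so bad data flows on.
function get_page_size_cast( $value ) {
    return (int) $value;
}

// Fix by type checking: unexpected input is detected and handled explicitly.
function get_page_size_checked( $value, $default = 10 ) {
    if ( is_int( $value ) ) {
        return $value;
    }
    if ( is_string( $value ) && preg_match( '/^\d+$/', $value ) === 1 ) {
        return (int) $value;
    }
    return $default;
}

var_dump( get_page_size_cast( 'abc' ) );    // int(0)
var_dump( get_page_size_cast( null ) );     // int(0)
var_dump( get_page_size_checked( 'abc' ) ); // int(10)
var_dump( get_page_size_checked( '25' ) );  // int(25)
```

The cast version silences the static analysis report, but a page size of 0 now travels further into the application, where the resulting problem is harder to trace back to its origin.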

Scans run with PHPStan 0.12.52.

Testing

Due to the nature of the problematic changes in PHP 8.0, static analysis can only go so far. Manually reviewing and testing software is painstaking work and humans are very prone to overlook things when there is a lot to be looking out for.

Manual testing performed by end-users tends to be relatively useless, as this will generally only result in “happy paths” being tested. Comprehensive exploratory and regression testing is needed to achieve more reliable results. And even if problems are found, it requires extensive debugging to figure out the cause of the problem (WordPress? Plugin? Theme?) and whether it is related to PHP compatibility.

More than anything it is important to have automated tests of good quality and to run these on PHP 8, as this will give the best indication of PHP 8.0 problems to expect.

Running automated tests on PHP 8

Getting an automated test suite to run on PHP 8 takes us down the next rabbit hole as the de facto tool for unit testing in the PHP world, PHPUnit, generally does a major release every year and with each major drops support for older PHP versions and introduces breaking changes. The first PHPUnit version which is officially compatible with PHP 8.0 is PHPUnit 9.3, released August 2020.

As WordPress still supports PHP 5.6 as a minimum, to run tests on PHP 8.0, any WordPress related test suite will have to be compatible with PHPUnit 5 up to PHPUnit 9.

While tooling is being built to help with that (look out for a blogpost about this over the next week!), it still takes time and effort to implement these tools and make a test suite compatible. Time which is taken away from the time available to actually fix PHP 8 related problems.

Getting the tests to run on PHP 8 for WP Core

The tests for WP Core are run against PHP 8 and are currently passing. The tests are being run on a composer installed version of PHPUnit 7.5, even though PHPUnit 9.3 is the earliest PHPUnit version officially compatible with PHP 8.0.

This last point has been overcome by copying a select number of files/classes from PHPUnit 9.3 to the WP test suite, excluding the PHPUnit native classes from the Composer autoload generation, in favour of using the copies from PHPUnit 9.3 in the WP test suite. While this works for now, this is a hacky solution and may not be sustainable in the future, aside from the maintenance it may currently require.

Note: while all other WordPress Core CI builds use the Phar version of PHPUnit, this is not possible when running PHPUnit 7.5 on PHP 8.0 as PHPUnit 7 is no longer maintained and the Phar contains incompatible dependencies of PHPUnit. Using a Composer based install of PHPUnit overcomes this as it will pull in the latest compatible versions of the dependencies, though --ignore-platform-reqs is needed.

As for the quality of the tests, this was relatively low to begin with, with loose type checking being used in the majority of cases.

A Trac ticket to address this was already opened in 2016. With an eye on the stricter type adherence in PHP 8.0, this ticket has been revived and a lot of work has been done to mitigate this.

At the time of writing, there are nearly 800 instances (676 assertEquals() + 96 assertNotEquals()) still using loose type checking – down from over 8000 instances.

In part the remaining loose type assertions are legitimate when objects are being compared, in part these still need to be addressed, but would currently result in test failures. These last ones highlight shortcomings, sometimes in the tests, but more often in the code being tested.
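To show what loose type checking lets through, here is a small sketch using plain comparison operators, which mirror the loose comparison of assertEquals() and the strict comparison of assertSame() for scalar values. The function under test is hypothetical.

```php
<?php
// Hypothetical function under test: documented as returning an int,
// but actually returning a numeric string.
function get_post_count() {
    return '3';
}

$actual = get_post_count();

// Loose comparison, comparable to assertEquals( 3, $actual ): passes.
var_dump( 3 == $actual );  // bool(true)

// Strict comparison, comparable to assertSame( 3, $actual ): fails,
// which exposes the wrong return type.
var_dump( 3 === $actual ); // bool(false)
```

A test suite built on the loose variant will stay green while the code under test hands out the wrong type, which is exactly the kind of mismatch PHP 8.0 is stricter about.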

Code Coverage

When evaluating the value of the current test suite, it is important to look at the code coverage of the test suite and to understand how code coverage works.

Code coverage can be reported in two ways:

Unclean
Clean (strict)

To explain the difference and why this difference is important, we take a simple code sample:

function bar() {
    // Do something
}

function foo() {
    $bar = bar();
    // Do something more.
}

Now, if there are tests for method foo() and code coverage is not set up to be strict, the code coverage for the above code will be reported as 100% as both methods are called when the tests are being run.

In contrast, if @covers tags are used, the code coverage for the above code would be reported as 50% (only method foo()).

When finding PHP 8.0 related issues, this difference is important as a method with dedicated tests will – if things are done correctly – be “stress tested”, i.e. tested with different types and values of input, making sure the method handles unexpected situations correctly.

In contrast, the method which does not have dedicated tests, but is “incidentally” covered by a test, will only have been tested for the “happy path”, i.e. giving a result so the test for the method really being tested, can continue.

As most new fatal errors in PHP 8.0 are related to the “unhappy path”, it is important to have dedicated tests for all code and for these dedicated tests to cover both the “happy path”, as well as the “unhappy path”.

By improving the test suite to use @covers tags, it will become clear which part of the code base does not have dedicated tests. This will allow for setting informed priorities for expanding the test suite, starting with the most important parts of the code base which don’t have dedicated tests.

Four years ago, an issue was already opened to draw attention to the missing @covers tags. This dormant ticket has also been revived with an eye on PHP 8 and a number of patches to add @covers tags have been submitted over the past few weeks. None of these have been committed so far.

At this time, WordPress has mostly “unclean” code coverage. A scan in August of this year showed an unclean code coverage of 39.82%. A scan this week showed an unclean code coverage of 40.32%, which is an improvement of 0.5 percentage points.

Based on previous experiences, an unclean code coverage of 40% will likely translate to approximately 15 – 20% real code coverage, leaving 60% of the WordPress Core code currently completely untested and another 20% or more only tested for the “happy path”. When tied to the enormous amount of breaking changes in PHP 8, this becomes highly concerning. Basically, automated testing as a strategy for detecting compatibility issues cannot be relied upon.

Note: a strict code coverage run which only records coverage for those tests with @covers tags in the WordPress test suite, currently shows a 6.8% code coverage, so that can be taken as the absolute minimum.

Testing plugins and themes

Only a small percentage of the available plugins, the more popular and professionally developed ones, have automated tests in place. This is worrisome as generally speaking, the average WordPress site runs about 19 or 20 plugins. Quite a few sites have more plugins running though.

For themes, having automated testing in place, is even more rare.

Allowing for these test suites to run on PHP 8.0 is challenging and that’s even before insight can be gained about the PHP 8 compatibility of these plugins and themes.

However, more than anything, the plugins/themes which have tests are likely the ones where the least amount of PHP 8.0 problems can be expected as they use a professional development model.

The multitude of plugins and themes without tests are much more cause for concern as these are more likely to be problematic when run on PHP 8.

For plugins (and themes), which do have tests, there are primarily two types of tests they may or may not have in place:

Integration tests. These are tests where WordPress itself is loaded before running the test suite and which will use the WordPress core code and integrate with the WordPress test suite.
Unit tests. Stand-alone tests which “mock” WordPress to allow for testing the plugin code, often using popular frameworks like Mockery or BrainMonkey.

Integration tests

As WordPress itself has decided to stick to PHPUnit 7.5, this means for integration tests for plugins and themes, that those will also be bound to PHPUnit 7.5 (maximum).

Plugins and themes will either need to copy the hack in WordPress core to get their integration tests running or alternatively, they need to use the files in WordPress Core, but will then need to create a custom autoloader as they cannot use the same Composer autoload generation hack.

Such a custom autoloader will need to be bootstrapped before the Composer autoload file to prevent the PHPUnit native files from loading anyway.

Unit tests

For unit tests using BrainMonkey or Mockery, PHPUnit > 8 is needed as the Mockery framework (also used by BrainMonkey) available for PHPUnit 7.x is not compatible with PHP 8.0. This means that these test suites will have to be made compatible with PHPUnit 5 up to 9. This also adds another challenge, as when both types of test suites are being used, different versions of PHPUnit are needed to run each test suite.

To aggravate this situation, plugins will generally have a committed composer.lock file to make sure their runtime dependencies are at a certain version they can depend on and which is compatible with PHP 5.6. This last part is often enforced by having a platform: php 5.6 type of configuration in their composer.json file. However, that also means their dev-dependencies, like PHPUnit, BrainMonkey, Mockery, will also be locked at a version which is compatible with PHP 5.6, which will prevent the tests from running on PHP 8.0.

This can be overcome by removing the platform setting on the fly and updating the composer.json and composer.lock files, but this does make running the tests on PHP 8.0 yet more involved, both in CI, as well as locally for its developers, let alone for outside contributors.

Compatibility of external WordPress dependencies

Part of WordPress compatibility with PHP 8 is also determined by its external PHP dependencies. While the latest nightly of WordPress seems to be compatible with PHP 8 based on the strategies listed above (and with the necessary caveats), this cannot be said for the external dependencies yet.

External dependencies (maintained)

At this time, the PHP 8 compatibility status of external dependencies of WordPress Core, like GetID3, PHPMailer and Requests, is unknown.

GetID3 has precisely 1 test for the unreleased 2.0 branch. This test is not run in CI. Code coverage is basically 0%. A number of PHP 8 related fixes based on static analysis discovery have been pulled and merged, but are not contained in a tagged release yet.
PHPMailer has tests (+/- 73% unclean code coverage), but these aren’t run against PHP 8 yet. Contact has been established with the maintainer of the library and movement on this is expected over the next week or so. So far, one PHP 8 issue has been reported and fixed. The fix is not available yet in a tagged release.
Requests has tests (+/- 90% unclean code coverage). While the tests are being run against PHP 8, they are failing due to the PHPUnit version being used not being compatible with PHP 8. The actual PHP 8 status won’t be known until the test suite is fixed. The project hasn’t had a release since 2016. The maintainer has largely abandoned the project, but is liberal with giving out commit rights to anyone who complains about it. Two of these “complainers” are currently collaborating to at least get a new version tagged, as well as getting the tests properly running on PHP 8, but this is a temporary situation born out of pure desperation and the abandonment of this repo by the maintainer is still a point of concern.
SimplePie has tests, though the code coverage is unknown. The tests are being run on PHP 8. A number of PHP 8 related fixes have been pulled and merged and are included in the 1.5.6 release. WordPress has updated to SimplePie 1.5.6. There is currently one unaddressed test failure on PHP 8.
Random_Compat and Sodium_Compat are not much of a concern. Aside from these being very well tested, the code in these compat layers will not be called when PHP 8.0 is used. All the same, Sodium_Compat has been updated to the latest version to address a PHP 8 issue in the bootstrap.

Based on the above, it can only be hoped that new releases of most of these external dependencies, containing PHP 8.0 compatibility fixes, are forthcoming over the next few weeks/months. This also means that WordPress will still need to update these external dependencies once the new releases become available, even though WP has entered the beta stage for WP 5.6 already.

External dependencies (unmaintained)

Other external dependencies, like pomo, PCLZip, Snoopy, RSS, are no longer maintained externally and the burden of making these compatible with PHP 8.0 falls on WP itself.

Most of these, with the exception of pomo, are not tested in the WP testsuite and even explicitly excluded from code coverage and other quality checks, making their PHP 8 status another unknown.

Also, as these external dependencies are “old” code, which has barely seen any maintenance over the past few years aside from fixing select PHP cross-version compatibility issues based on static analysis, this code should be considered PHP 4-code and approached with caution and trepidation for more problems to come.

Impact on broader ecosystem

While each of the above listed issues individually is of small concern, the accumulation of these all together is cause for serious concern about the compatibility of WordPress with PHP 8.0 for the foreseeable future.

To top it off, WordPress is never run as a stand-alone product, but always accompanied by a theme and some plugins. The extensibility of WordPress has been a large factor in its success, but also poses hard to overcome extra challenges in terms of compatibility.

The non-type safe use of broadly relied upon functions like apply_filters() and unprotected global variables will be a pain for years to come, causing fatal errors which will be mis-identified as belonging to Core or plugin B, while caused by plugin A. This is obviously a major cause of concern and is likely to bite a lot of end-users.
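The way plugin A can cause an error that surfaces in plugin B can be sketched with a minimal stand-in for a filter registry. Everything below is hypothetical: Demo_Hooks is not the WordPress implementation of apply_filters(), and the plugin functions are made up for illustration.

```php
<?php
// Minimal stand-in for a filter registry; NOT the WordPress implementation.
class Demo_Hooks {
    private static $filters = array();

    public static function add( $tag, callable $callback ) {
        self::$filters[ $tag ][] = $callback;
    }

    public static function apply( $tag, $value ) {
        foreach ( self::$filters[ $tag ] ?? array() as $callback ) {
            $value = $callback( $value ); // No type safety on the return value.
        }
        return $value;
    }
}

// "Plugin A": returns an array where a string is expected.
Demo_Hooks::add( 'demo_link', function ( $url ) {
    return array( $url );
} );

// "Plugin B": trusts the filtered value to still be a string.
function plugin_b_clean_link( $url ) {
    $url = Demo_Hooks::apply( 'demo_link', $url );
    return ltrim( $url ); // PHP 7: warning. PHP 8: TypeError.
}

// Defensive variant: validate the type after filtering.
function plugin_b_clean_link_safe( $url ) {
    $filtered = Demo_Hooks::apply( 'demo_link', $url );
    return ltrim( is_string( $filtered ) ? $filtered : $url );
}

try {
    plugin_b_clean_link( ' https://example.com' );
} catch ( TypeError $e ) {
    // On PHP 8 the stack trace points at plugin B, not at plugin A.
    echo 'TypeError surfaced in plugin B', PHP_EOL;
}

echo plugin_b_clean_link_safe( ' https://example.com' ), PHP_EOL; // https://example.com
```

The error is reported at the ltrim() call, so the plugin that returned the wrong type never shows up as the place where things went wrong.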

Part 3: yoast.com case study

To the point of plugin A causing plugin B or even WordPress core to break, we decided it would be good to run an analysis on yoast.com. In order to get a good overview of the impact PHP 8 would have on a large WordPress site we’ve compiled a list of all PHP warnings currently being thrown on yoast.com that, on PHP 8, would cause fatal errors. It’s important to note that this only covers a small portion of the breaking changes, but it gives a nice indication of the impact these changes could have on a live site.

What did we do?

Via the PHP_FPM module for Elastic beats we logged all PHP warnings and errors straight into our Elastic database. We decided to compile a list of warnings that occurred in the last 30 days and filter that list down to the warnings that would raise errors in PHP 8. We got the following list as a result:

Error message	Type	Count
Creating default object from empty value	Theme	266,413
ltrim() expects parameter 1 to be string, array given	WP	131,666
count(): Parameter must be an array or an object that implements Countable	Plugin	7,129
Invalid argument supplied for foreach()	WP	4,685
A non-numeric value encountered	Plugin	3,072
A non-numeric value encountered	Plugin	3,072
A non-numeric value encountered	Plugin	2,981
Division by zero	Plugin	1,288
array_filter() expects parameter 1 to be array, null given	Plugin	830
call_user_func_array() expects parameter 1 to be a valid callback, class ‘HelpScout_Docs_API\Admin’ does not have a method ‘insert_post’	WP	398
Illegal offset type in isset or empty	WP	267
A non-numeric value encountered	Plugin	184
A non-numeric value encountered	Plugin	100
count(): Parameter must be an array or an object that implements Countable	Plugin	97
count(): Parameter must be an array or an object that implements Countable	Theme	53
A non-numeric value encountered	Site	50
trim() expects parameter 1 to be string, array given	WP	38
count(): Parameter must be an array or an object that implements Countable	Plugin	28
A non-numeric value encountered	Plugin	19
Illegal string offset ‘full’	Site	13
A non-numeric value encountered	Plugin	9
Invalid argument supplied for foreach()	Plugin	7
Invalid argument supplied for foreach()	Plugin	4
Illegal string offset ‘usp’	WP	3
count(): Parameter must be an array or an object that implements Countable	Plugin	2
Invalid argument supplied for foreach()	Plugin	2
strlen() expects parameter 1 to be string, array given	Plugin	2
date() expects parameter 2 to be int, string given	Site	2
date() expects parameter 2 to be int, string given	Site	2
A non-numeric value encountered	Plugin	1
array_keys() expects parameter 1 to be array, null given	Plugin	1
Invalid argument supplied for foreach()	Plugin	1

Analysis

As you can see, we got a list that ranges from warnings that occur over 250,000 times a month in our theme to warnings that occurred only twice (during PayPal checkouts). In total there were nearly half a million warnings of 32 different types that, on PHP 8, would cause fatal errors.

Warnings coming from anywhere

Of those 32 types of warnings we found 6 that occurred in code that’s specific to yoast.com and thus wouldn’t occur on any other site. 20 different types originate directly from 8 different plugins. The remaining 6 occurred in WordPress core code, although all of them were due to the interaction with plugins. The example of plugin A breaking WordPress core checks out.

From frequent to rare

Of all these warnings, there are two that occur extremely frequently, over 100,000 times in the last month: one originating from our own theme and one due to the esc_url() function being called with an array instead of a string. As we start testing yoast.com on PHP 8 these are likely easily found and just as easily fixed. It won’t take much effort to reproduce these. A simple smoke-test on a staging site would find these errors.

Six types occur less frequently, more than 1,000 times but fewer than 10,000 times a month. All of these originate from plugins and, with 8 million requests served by our servers each month, each occurs in fewer than 1 in 1,000 requests. These are errors that likely wouldn’t all be found during a quick test on a staging environment but would call for much more intensive testing to be reliably caught. Given their frequency it’s likely that these would cause disruption on most sites and if these occur during, for example, AJAX requests they could be much harder to detect. Especially on e-commerce sites users may not report these errors and simply abandon their cart.

Five other types occur between 100 and 1,000 times. All of these come either from plugins or from how plugins are interacting with WordPress core. Given their frequency, it’s unlikely that all of these would be found while testing a site on PHP 8. They are frequent enough that there likely will be users reporting issues eventually but this may take some time due to their rarity. Similar to the issues above, if these occur during hidden requests they could be much harder to detect.

Finally we have 19 types of warnings that occur fewer than 100 times a month. Five of these are specific to yoast.com with the remaining 14 originating from plugins or their interactions with WordPress core. It’s entirely possible that, even with intensive testing, none of these would be found. These are warnings that occur in fewer than 1 in 100,000 requests; 10 of them even occur in fewer than 1 in 1 million requests. The disruption these would cause is likely to be minimal, although it should be noted that 4 of these occur during checkout and thus would impact a few very specific users more severely. These are errors that due to their rarity may go undetected for months.

In conclusion, when migrating a WordPress site to PHP 8.0 it’s likely that a small number of very frequently occurring issues will be found easily and quickly and fixing these will account for 95% of total errors. With more intensive testing it should be very possible to account for 99% of all errors but based on the data on yoast.com it’s likely that at this point less than half of the different causes will have been found, with the remainder being issues that occur extremely rarely but may happen in some very unfortunate places, such as our checkout.
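The rough 95% and 99% figures can be checked against the table. The sketch below simply sums the monthly counts as printed there; the helper function name is hypothetical.

```php
<?php
// Monthly counts copied from the table above, in the same order (32 rows).
$counts = array(
    266413, 131666, 7129, 4685, 3072, 3072, 2981, 1288,
    830, 398, 267, 184, 100, 97, 53, 50,
    38, 28, 19, 13, 9, 7, 4, 3,
    2, 2, 2, 2, 2, 1, 1, 1,
);

// Hypothetical helper: percentage of all warnings caused by the $n most frequent types.
function share_of_top( array $counts, $n ) {
    rsort( $counts );
    return 100 * array_sum( array_slice( $counts, 0, $n ) ) / array_sum( $counts );
}

printf( "Warning types: %d\n", count( $counts ) );             // 32
printf( "Top 2 types: %.1f%%\n", share_of_top( $counts, 2 ) ); // 94.2%
printf( "Top 8 types: %.1f%%\n", share_of_top( $counts, 8 ) ); // 99.5%
```

In other words, fixing 8 of the 32 types removes almost all of the volume, while the remaining 24 types, three quarters of the distinct causes, together make up well under 1% of the warnings.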

Things are looking tricky for PHP 8 compatibility on large WordPress sites

By just investigating a subset of breaking changes in PHP 8 we could already confirm this is likely to cause major breakage on sites with unclear origin of that breakage. Oftentimes the error will occur in one place, but is caused by a plugin or theme in a different place, making these issues hard to debug.

Yoast.com is obviously an actively maintained WordPress site, supported by a team of professional developers. The large majority of WordPress sites don’t have this luxury and it will not be easy to mitigate compatibility issues on these sites.

Conclusion

PHP 8 is going to contain a lot of breaking changes. We’ve described a subset of those changes in this report: the ones which we assume will have the biggest impact on WordPress and the broader WordPress ecosystem. Those mainly have to do with warnings becoming errors and the many strict-type-related errors that are being introduced. A rather high portion of these changes can only be detected at runtime.

Fixing these compatibility issues is a major task and to do so you need to use different strategies, ranging from static analysis to automated testing. To do this properly the right tools are required. For projects like WordPress which need to support a range of PHP versions, many extra complexities are introduced in juggling around different versions of the analysis tools. This becomes increasingly hard, partly due to the syntactic and runtime differences between PHP 5.x and 8.x being so incredibly big. This is not to argue whether supporting PHP versions that far back is a good idea or not; the only conclusion here is simply that it becomes increasingly hard to do so.

We’ve also looked at the problem of coverage and WordPress’s PHP dependencies. High test coverage is necessary to reliably detect compatibility issues and it is even more important with PHP 8, where there are many more compatibility issues than usual and a lot of them can only be detected at runtime. In that sense, it’s hard to say what WordPress core compatibility with PHP 8 truly is, since test coverage is low and virtually absent for dependencies.

Because of PHP 8 focusing so strongly on strict typing, WordPress’ type unsafe extensibility system also becomes extra vulnerable to errors, potentially leading to plugins causing type errors in other plugins or WordPress itself. To put this to the test, we ran an analysis on error data from yoast.com from the last month. As a large site with a shop running on it, we figured it might give a good indication of the types of problems we can expect. Indeed we found many warnings that will turn into errors with PHP 8.

One final note we would like to make, is that WordPress isn’t the only legacy codebase out there and also not the only project which aims to support a wide range of PHP versions. The information in this article might thus well apply to other projects too.

We will not include any further call to action. The primary goal of this post is to inform and create an overview of the problems and challenges related to PHP 8 compatibility in WordPress. We sincerely hope it serves this purpose well.

Updated 11-11: changed the sentence about how many plugins a site runs. It previously said, in error, that that was 30 to 100. After consulting our tracking data we’ve concluded that the average is in fact 19.3 and thus 19 to 20 plugins.

The post The 2020 WordPress and PHP 8 compatibility report appeared first on Yoast.

Splitting RTL texts into sentences, part 2

Manuel Augustin — Tue, 18 Aug 2020 09:15:12 +0000

In a previous article, I’ve told you the story of how we fixed the problem of sentence tokenization in RTL languages. If it’s only this specific technical problem you were interested in, read no further! I’m not going into more detail about the technical solution. But there’s also a bigger lesson to be learned here, a lesson about how to approach problems, how your hastily formed preconceptions can block you, and how you need to overcome them. If you’re interested in that broader lesson, this article is for you!

The history of the fix

I’ve explained the technical details of the original problem and the solution in a previous article. I’ve also told you that it was a simple fix. But the fact that a fix is simple doesn’t always mean it’s easy to spot. So the part I haven’t told you yet is that it took us a really long time to come up with a solution to the problem. I think there are two reasons for that. First, I was too quick to jump to the conclusion that it was actually a big problem. Second, this was one of those cases where I thought I needed to fix the whole world, not just one small problem. (I know this sounds overly dramatic, but it’ll make sense soon!)

In what follows, I want to point out a few mistakes I made when tackling the sentence tokenizer issue. The main issue here wasn’t technical oversight, but more the mindset with which I initially approached the problem.

Challenge your assumptions

For a long time, I thought there was some bigger, underlying issue with the sentence tokenizer. Based on the output of the sentence tokenizer, I always assumed that it was incorrectly reading RTL languages as LTR languages, that is, from left to right. As I mentioned earlier, I imagined it was some really big problem, potentially not with our custom rules for the tokenizer, but with the tokenizer library itself or even with JavaScript in general. I drew these conclusions based on some relatively superficial debugging. As you’ve learned, in the end it turned out that the problem was not some big, horrible bug hidden in the depths of a library or even a whole programming language, but something much more mundane.

The danger of fossilized beliefs

Granted, when I first looked into the problem back in the day, I was really just a rookie, relatively new to our text analysis library as well as programming in general. So it’s not too crazy that I drew the wrong conclusions at the time.

What is crazy, though, is that this conclusion, once established, really fossilized into a firmly held belief. “We can’t process RTL languages because there’s some big, underlying problem with sentence tokenization”, I’d repeat to people, because that’s what I actually believed. This way, a personal belief ended up becoming a bit of a myth that other people around me believed as well.

How to challenge your own assumptions

Fast-forward a year or two. I’d love to be able to say, “Oh, I learned a lot in the meantime, took another look at the problem and finally realized that I made a mistake.” Alas, that’s not quite how it went. It took someone with more experience, but more importantly, a less biased view of the issue, to look into it to find out the simple fix that eventually solved the problem.

What can we learn from this story? The obvious lesson is: keep challenging your own assumptions. This statement is very true, but also very broad and therefore hard to put into practice. So here are a few more practical tips for how to apply this lesson to your work as a developer.

Tip 1: be explicit about what you know

The first tip is: be very explicit about what conclusions you’ve drawn based on what evidence. For example, I had done some debugging of the final output of the sentence tokenizer. Based on this, I concluded that the output was incorrect for languages like Arabic and Hebrew. On the other hand, it worked just fine for LTR scripts like the Latin characters we use for English or the Cyrillic characters used in Russian. As a consequence, I thought that writing direction was the problem (which it wasn’t) and that there was some fundamental problem with parsing RTL scripts (also wrong).

Jumping to such unwarranted conclusions probably becomes even more likely when you’re faced with unfamiliar or complex data. For example, I was faced with these scripts that I can’t read. It’s probably the same when you’re facing an unfamiliar file format, programming language, or programming style. The more unfamiliar the material you’re working with, the more cautious you should be.

Writing down clearly what you’ve observed gives you a much better view of the actual facts. It’s not always possible to dive deep into a problem, thoroughly investigate it, and fix it on the spot. But by documenting very clearly what you have investigated – and forgoing any unwarranted conclusions – you create a much more accurate picture of the status quo. That way, it becomes less likely for fossilized beliefs or myths to emerge and take on a life of their own.

Tip 2: get a second pair of eyes

The second tip is: make sure to share your findings with others and let them critically examine your conclusions. If there’s time and capacity, it’s even better if you can ask a colleague, a friend, or another kind-hearted soul to pair-program with you on the debugging process after you’ve established your hypothesis about the problem. It’s easy to fall down a rabbit hole when debugging a technical issue. A second, unbiased pair of eyes is always a great asset, potentially helping you to find other solutions.

So this section was a reminder to keep an open mindset about the conclusions you draw and the assumptions you make when investigating a problem. The next section is about another factor that is likely to cloud your judgment when it comes to avoiding premature conclusions: it’s about fixing problems in an imperfect world.

Solving problems in an imperfect world

This section is about looking for a small fix in an imperfect world. What’s the imperfect world, you might ask? For me, that was how our tokenizer works.

Let me give you a little reminder of its functionality: first, it takes an HTML document and segments both the HTML and the text embedded in it. After that, it puzzles together sentences. It skips some HTML but also leaves in some to be dealt with at a later point. It all works and there probably were some good, practical reasons for why it works the way it does. But as someone looking into it for the first time, I mostly got the impression that we should refactor the whole thing.

I was thinking: wouldn’t it be more elegant and easy to work with if we first took out all the HTML? This would leave us just with a text representation. We could take this “pure” text and tokenize it into sentences, and not have to think about HTML anymore. In an ideal world, that’s probably how it would work. But the world doesn’t have to be perfect to make some changes in it.

Why the imperfect world shouldn’t block you

Here’s the point: the fact that the world is imperfect doesn’t mean we can’t fix actual problems within the imperfect world, like the RTL problem. Chances are that the world will never be perfect. Striving for this perfection shouldn’t stop you from fixing real-world problems that you’re facing right now.

Don’t get me wrong, I’m not advocating for never doing any large-scale refactoring. But that should be a conscious choice on its own. You should make this choice by weighing up all the pros and cons it entails. If you can approach it like that, as a separate problem, you’re fine. But that’s not what I was doing. I ended up being mentally blocked by the fact that the status quo didn’t represent an ideal situation. For me, the ideal was a situation where we process HTML first and then go about sentence tokenization. While I still believe that it’s an ideal we should strive for in the future, this shouldn’t block me. It shouldn’t hinder me from implementing what’s needed right now as a concrete improvement for users of our software.

Taken together, my assumptions and preconceptions reinforced each other: on the one hand, I thought we had a big problem that required in-depth changes; on the other hand, I thought we had an imperfect world potentially causing that problem and needing a big revamp anyway. Together, these beliefs made me see one huge task rather than individual problems to be solved one at a time. And that’s not a good way to approach a problem.

Disentangle individual problems

So what lessons can we draw from this? For one, separate your problems and examine and prioritize them one by one. For example, looking in-depth for an RTL fix was one problem. Refactoring the tokenizer to make it easier to deal with was another. These problems might be related, but you shouldn’t assume from the beginning that they necessarily depend on each other. Again, clearly documenting separate issues and outlining their individual scopes can help. You’ll get a better idea of the various problems you’re facing.

Of course, it might turn out to be the case that multiple problems should be solved together. Or the solution to one problem might depend on another. However, these things should always be conscious choices you make based on hard evidence. Also, if you’re working together with other people, there are usually multiple stakeholders involved in that decision-making process. And you, as a developer, should make sure the facts are spelled out as clearly as possible. That way, it’s possible to make sound decisions based on these facts.

Conclusion

In this article, I’ve talked about how, as developers, it’s important to be aware of your own preconceptions and assumptions. I used one problem and its simple, yet long-drawn-out fix as an example. With this example, I’ve shown how perfectionist impulses and premature conclusions can seriously cloud your judgment. My take-home message, therefore, is this: be clear about what you know and what you don’t know, document everything well, let your assumptions be challenged by others, and make sure to let yourself be guided by conscious choices.

The post Splitting RTL texts into sentences, part 2 appeared first on Yoast.

Splitting RTL texts into sentences, part 1

Manuel Augustin — Tue, 18 Aug 2020 09:15:02 +0000

In this article, we’ll outline a solution to a problem we faced when splitting up a text into sentences in RTL languages. These are languages with a right-to-left written script such as Arabic, Farsi, Urdu, and Hebrew. You’ll learn about the technical implementation of sentence tokenization in the Yoast text analysis, and how we expanded this to also cover these RTL languages starting with Arabic in Yoast SEO 14.8. Spoiler alert: it didn’t actually have anything to do with the writing direction! If you’re interested in this specific natural language processing problem, this article is for you!

But wait, there’s more! This article comes with a second part in which we talk about the process behind the search for a solution. So if you also want to improve your practices as a developer – and who wouldn’t? – make sure to also read part 2!

Sentence tokenization – the basics

The Yoast SEO content analysis consists of multiple assessments that give you information about the SEO-friendliness and readability of your post. Many of these assessments operate on sentences. For example, we tell you whether your sentences are too long. Also, when counting keyphrases or transition words, we do that per sentence. This means we need to split texts into sentences, which isn’t as simple to do adequately as it might sound. Yet, for most languages we’ve had this capability since the inception of the Yoast content analysis.

However, we found some issues when looking into expanding our analysis to RTL languages. RTL languages are languages that are written from right to left, such as Arabic, Hebrew, Farsi, and Urdu. When using our existing sentence splitting mechanism, we found that sentences weren’t split correctly. Compare the following example of an LTR script such as the Latin alphabet, which is used to write English, and an RTL script such as Hebrew. For each script, there’s an input text and a tokenized version. Note the incorrect tokenization in the Hebrew text.

Latin alphabet input text:

Lorem ipsum dolor sit amet. Sea an debet conceptam, in sit exerci vidisse, quo at paulo aperiam corrumpit. Ei everti.

Tokenized text:

Lorem ipsum dolor sit amet.
Sea an debet conceptam, in sit exerci vidisse, quo at paulo aperiam corrumpit.
Ei everti.

Hebrew input text:

.נפלו ברית חפש בה. כלל עסקים בקרבת של, והוא האטמוספירה מדע אל, צ’ט תורת הגרפים תקשורת על. זאת מה הארץ

Tokenized text:

1) .נפלו ברית חפש בה. כלל עסקים בקרבת של, והוא האטמוספירה מדע אל, צ’ט תורת הגרפים תקשורת על. זאת מה הארץ

As you can see, the LTR text is split correctly. As we’d expect, sentences are split on all full stops. Looking at the RTL text, this looks a bit different. Now, you might not understand Hebrew – the text in the example isn’t real Hebrew by the way, but a Hebrew equivalent of a meaningless Lorem ipsum text – but you’ll be able to spot some full stops in the original. Just like in an LTR language, sentences should be split on those full stops. So it should look like this:

1) .נפלו ברית חפש בה

2) .כלל עסקים בקרבת של, והוא האטמוספירה מדע אל, צ’ט תורת הגרפים תקשורת על

3) .זאת מה הארץ

So why did that not work as expected? To answer that question, we’ll first dive a bit into how we split sentences.

Sentence tokenizing in LTR languages

The raw input for our text analysis is an HTML document. This means the problem we need to solve isn’t only how to split a text into sentences, but we also need to separate out all HTML elements from that document.

To achieve this result, we process the text in two rounds: first, we tokenize the whole HTML document. For this, we use the external library tokenizer2. We feed certain rules into that tokenizer that will single out a string as a specific token. Rules are mostly constructed as regular expressions. For example, we have a regular expression that identifies an opening HTML block. The end result of the first round is a tokenized text in which we identify an HTML start token, an HTML end token, or a sentence token. Here’s an example of an HTML-formatted input text and the result of the first round of tokenization for that text:

Input text:

Lorem ipsum dolor sit amet. Sea an www.yoast.com debet conceptam, in sit exerci vidisse, quo at (paulo aperiam) corrumpit? Ei everti.

Tokens:

Results of the first round of sentence tokenization

This representation already resembles something we can work with. We see that the text has been split up into both HTML elements as well as textual elements. From these, we need to puzzle together the sentences that we want to use for our analysis. This happens in the second round of sentence processing.
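
To make the first round a bit more concrete, here is a minimal Python sketch of rule-based tokenization. It is not the tokenizer2 API and not our actual rule set: the rule names, the regular expressions, and the function tokenize_html are assumptions made up for this illustration. It only separates HTML tags from text and does not split the text on punctuation.

```python
import re

# Hypothetical rules for illustration only. Each rule pairs a token type
# with a regular expression; rules are tried in order at the current position.
TOKEN_RULES = [
    ("html-end", re.compile(r"</[a-zA-Z][^>]*>")),
    ("html-start", re.compile(r"<[a-zA-Z][^>]*>")),
    ("sentence", re.compile(r"[^<]+")),
]


def tokenize_html(text):
    """Split an HTML string into (token_type, value) pairs."""
    tokens = []
    position = 0
    while position < len(text):
        for token_type, pattern in TOKEN_RULES:
            match = pattern.match(text, position)
            if match:
                tokens.append((token_type, match.group()))
                position = match.end()
                break
        else:
            # A stray "<" that is not part of a tag is kept as text.
            tokens.append(("sentence", text[position]))
            position += 1
    return tokens


for token in tokenize_html("<p>Lorem ipsum <b>dolor</b> sit amet.</p>"):
    print(token)
```

The order of the rules matters in this sketch: the closing-tag rule is tried before the opening-tag rule, and plain text is the fallback.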

In the second round, we go over the tokens one by one and decide whether we should include them in sentences. We do this again following a set of rules. For instance, a sentence in its most basic form will be a sentence token starting with a capital letter. To this, we’ll add all following sentence tokens until we again encounter a full stop followed by a sentence token starting with a capital letter. When that happens, a new sentence will be started. With this and some other rules, we arrive at a final result which looks something like this:

Final result of sentence tokenization

Here, we see that sentences have been split as we’d expect. Note, for example, that the full stops in the URL aren’t treated as sentence boundaries, as they’re followed by a letter rather than white space. There’s still some HTML within sentences, but that’s something we deal with at a later stage. For the purposes of our analysis, this is suitable material.

So that’s the working sentence processing mechanism for LTR scripts. What about RTL scripts, where we saw that it doesn’t work? We’ll outline this in the next section.

Sentence tokenizing in RTL languages

Now that we’ve seen how correct sentence processing works in LTR languages, let’s return to our problematic RTL example:

Hebrew text:

.נפלו ברית חפש בה. כלל עסקים בקרבת של, והוא האטמוספירה מדע אל, צ’ט תורת הגרפים תקשורת על. זאת מה הארץ

Incorrectly tokenized sentences:

1) .נפלו ברית חפש בה. כלל עסקים בקרבת של, והוא האטמוספירה מדע אל, צ’ט תורת הגרפים תקשורת על. זאת מה הארץ

Looking at the text above (and don’t forget to read right to left!) we see some Hebrew words and some full stops. Based on what we learned about sentence processing for LTR languages, we would expect the text to be split on those full stops. So we’d expect to get a number of split sentences like this:

.נפלו ברית חפש בה
.כלל עסקים בקרבת של, והוא האטמוספירה מדע אל, צ’ט תורת הגרפים תקשורת על
.זאת מה הארץ

Why isn’t that the case? Initially, we had a few hypotheses. It could be that there’s a problem with the sentence tokenizer somehow tokenizing RTL text the wrong way. It could even be that there’s a fundamental problem with JavaScript parsing of RTL text. In the end, the problem – and also the solution – turned out much simpler than that. In fact, it didn’t even have anything to do with writing direction. The answer to the problem was capital letters.

Remember what we mentioned above? A sentence in its most basic form will be a sentence token starting with a capital letter. The reason why this rule isn’t working in RTL languages is simply that Arabic and Hebrew don’t distinguish between capital letters and lower-case letters. So our check for a sentence start, which was a seemingly innocent check for a character that’s different from its lower case form, would always fail. Hence, no new sentence would be started. Instead, all tokens were continuously appended to one single initial sentence.
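
The failure is easy to reproduce in a few lines. The following Python sketch is a strongly simplified stand-in for the second round, not our JavaScript implementation: the function names are made up, and the only rule it applies is "start a new sentence after a full stop if the next chunk begins with a character that differs from its lower-case form".

```python
import re


def differs_from_lowercase(character):
    """The seemingly innocent capital-letter check described above."""
    return character != character.lower()


def split_sentences(text, is_sentence_start=differs_from_lowercase):
    """Join chunks that end in a full stop into sentences (simplified)."""
    chunks = re.findall(r"[^.]+\.?", text)
    sentences = []
    for chunk in chunks:
        first_character = chunk.lstrip()[:1]
        if not first_character:
            continue
        if sentences and not is_sentence_start(first_character):
            # No valid sentence start: keep appending to the current sentence.
            sentences[-1] += chunk
        else:
            sentences.append(chunk.lstrip())
    return [sentence.strip() for sentence in sentences]


latin = "Lorem ipsum dolor sit amet. Sea an debet conceptam. Ei everti."
hebrew = "נפלו ברית חפש בה. כלל עסקים בקרבת של. זאת מה הארץ."

print(len(split_sentences(latin)))   # 3
print(len(split_sentences(hebrew)))  # 1
```

Hebrew letters are identical to their lower-case form, so the check never succeeds and everything ends up in one sentence.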

The fix turned out to be as simple as the problem: we just had to make sure that all letters from the scripts of Arabic, Hebrew, Farsi etc. were also considered as valid sentence beginnings. Of course, we also added extensive tests to make sure that there were no problems with other types of sentences, for example sentences ending in question marks or exclamation marks. We also added a few language-specific sentence endings like a specific Urdu sentence ending dash and an Arabic inverted question mark. And that’s it – no fundamental JavaScript problems to be circumvented, no tokenization library problems to be solved.
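
As an illustration of the idea behind the fix, here is a small Python sketch of a sentence-start check that also accepts letters from the Hebrew and Arabic scripts. This is not the code we shipped: the function name is hypothetical, and relying on Unicode character names is an assumption chosen to keep the example short.

```python
import unicodedata

# Prefixes of Unicode character names for scripts without capital letters.
# Farsi and Urdu letters live in the Arabic script and are named "ARABIC LETTER ...".
CASELESS_LETTER_PREFIXES = ("HEBREW LETTER", "ARABIC LETTER")


def is_valid_sentence_beginning(character):
    """Accept capital letters and any letter of a case-less RTL script."""
    if character != character.lower():
        return True
    name = unicodedata.name(character, "")
    return name.startswith(CASELESS_LETTER_PREFIXES)


print(is_valid_sentence_beginning("L"))  # True
print(is_valid_sentence_beginning("l"))  # False
print(is_valid_sentence_beginning("נ"))  # True
print(is_valid_sentence_beginning("ك"))  # True
```

With a check along these lines, a full stop followed by a Hebrew or Arabic letter can start a new sentence, while lower-case Latin letters still cannot.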

Conclusion

In this article, we explained the technical aspects of how we implemented correct sentence tokenization for RTL languages, starting with Arabic in Yoast SEO 14.8. We showed you how we process sentences in general, and how we tweaked this approach to make it work for RTL languages as well. But that’s not quite the end of the story. While the fix we described was indeed simple, it was a long journey to get there. If you want to know about all the pitfalls – mostly in terms of mindset – that had to be overcome to get to this solution, proceed to part 2 of this series!

The post Splitting RTL texts into sentences, part 1 appeared first on Yoast.

Linking suggestions, a look behind the scenes

Hans-Christiaan Braun — Wed, 05 Aug 2020 09:04:07 +0000

The internal linking suggestions tool has been a staple of the Yoast SEO Premium plugin since 2017. After all these years it was time for an overhaul. In a dramatic turn of events, we actually took some lessons from the way search engines work. Ultimately, this led to a little internal search engine of our own, which we’ve released today with Yoast SEO 14.7! This post will take you on a journey behind the scenes.

A search engine?

Yes, indeed! If you look at it, our internal linking suggestion tool works just like a little internal search engine.

First of all, it creates a search query. This search query takes the form of a list of the most prominent words of the post that you are currently writing. Secondly, the tool creates an index, just like Google, of the content of your website. It saves the most prominent words of each post, page or term in a separate table in your WordPress database. Last, but not least, the tool uses the search query to search the index for relevant content. This all happens on your own server. In the end, this leads to a list of link suggestions, not unlike a search engine results page!

Do you want to know more? Let us dive deeper into the inner workings of this little search engine.

A deeper dive

The search query

The first step in the internal linking suggestions algorithm is to summarise the content of your post, page or term. This is more complicated than it may look because text has a natural order to it. If you change the order of two words in a sentence, its meaning can change completely.

However, there is an easy solution that has been tried and tested for decades. We just ignore the ordering! This approach is called the bag-of-words approach. Because we do not want the search query to grow very large, we limit the number of words in the query to the 20 most used ones. We call these the prominent words of your post, page or term page.
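
A minimal Python sketch of the bag-of-words idea: count the words, forget their order, and keep the most frequent ones. The function name and the simple word-splitting rule are assumptions for this example, not our actual implementation.

```python
import re
from collections import Counter


def prominent_words(text, limit=20):
    """Return up to `limit` (word, count) pairs, ignoring word order."""
    words = re.findall(r"\w+", text.lower())
    return Counter(words).most_common(limit)


# Word order is lost: both sentences produce the same bag of words.
print(dict(prominent_words("dog bites man")) == dict(prominent_words("man bites dog")))  # True
```

The example also shows the price of this simplification: "dog bites man" and "man bites dog" become indistinguishable.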

Adding word form support

Before we create this list of prominent words, however, we apply a linguistic method called stemming. This ‘collapses’ all the different word forms you use in your content to one canonical form. This makes sure that the content within your posts, pages and terms can be compared, even when you use different word forms. For example, the words ‘cats’ and ‘cat’ both collapse to the word form ‘cat’. Both words indicate our feline friends, so it makes sense that they would count as one prominent word.

By the way, for many languages like Spanish, English, German, Indonesian and Swedish, we automatically filter out common function words like ‘the’, ‘one’ and ‘many’. This improves the algorithm even more.
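
The two steps above can be sketched as follows. Note that toy_stem is a deliberately naive stand-in that only strips a plural "s"; real stemmers are language-specific and far more careful. The function-word list is likewise a tiny made-up sample.

```python
from collections import Counter

# Tiny illustrative sample; real function-word lists are much longer.
FUNCTION_WORDS = {"the", "one", "many", "a", "and"}


def toy_stem(word):
    """Naive stemmer for illustration: strip a single plural 's'."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def prominent_stems(words):
    """Count stems, skipping function words."""
    stems = [toy_stem(word.lower()) for word in words]
    return Counter(stem for stem in stems if stem not in FUNCTION_WORDS)


print(prominent_stems(["The", "cats", "and", "one", "cat"]))  # Counter({'cat': 2})
```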

The index

Every search engine needs an index. In our case, this index takes the form of a table in your site’s database. This table connects each post, page and term of your website to a list of its prominent words. More specifically: the table connects each indexable to its prominent words. This makes querying the index fast and efficient.

When you save a post, page or term, its prominent words get added to the index. On top of the words themselves, we also save how often they and their word forms occur in that specific piece of content. This way, we have a way of telling just how prominent these words are. Another way to save the prominent words of your website’s content is to use the internal linking indexing tool on the Yoast SEO tools page of your website. This will compute and save the prominent words of all your publicly available posts, pages and terms that have not been indexed yet in one go. This is useful when you have just installed Yoast SEO Premium on a site, or if the indexing algorithm has been improved.

The search algorithm

Now that we have a representation of the content on your site, as well as the content of the post, page or term you are currently writing, we need to match them together. What we need is a way to compare the prominent words of one post with the other. If they are similar, we can say that their contents are similar as well. With a reasonable amount of confidence, of course.

We can compare two pieces of content with a mathematical concept called a similarity measure. This is a mathematical function that takes two vectors (more on that later) and outputs a number that indicates how similar these two vectors are. A high number would mean that both are very similar. Before we can use a similarity measure, though, we need to transform the lists of prominent words to these nifty vectors. There is a pretty easy way to do this for bag-of-words models.

Transforming a bag-of-words model to a vector

A vector is basically a list of numbers. For a similarity measure to work, both vectors need to have the same number of items. For bag-of-words models, this can be done by labeling each position in a vector with a word that can occur on your website. The number located at that position in a vector would be the word’s weight. In this case the weight would be the number of times that word occurs in the content.

For example, the word ‘bear’ may always be tied to position 5. A value of 0 in a specific vector would mean that it does not occur in the post, page or term tied to that vector. A weight of 3 means that it occurs 3 times.
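
In Python, this transformation could be sketched like this. The vocabulary and the function name are made up for the example; the only point is that every word has a fixed position.

```python
def to_vector(word_counts, vocabulary):
    """Turn a {word: count} bag of words into a fixed-length list of weights."""
    return [word_counts.get(word, 0) for word in vocabulary]


# Every word that can occur on the site gets a fixed position.
# 'bear' is the fifth entry here, so its weight always ends up at position 5.
vocabulary = ["cat", "dog", "fish", "honey", "bear"]

print(to_vector({"bear": 3, "cat": 1}, vocabulary))  # [1, 0, 0, 0, 3]
print(to_vector({"dog": 2}, vocabulary))             # [0, 2, 0, 0, 0]
```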

The similarity measure

There are many similarity measures that are useful for text. We decided to use a commonly used one, the cosine similarity measure. The cosine similarity measure has two big advantages over other measures.

The first advantage is that, for vectors of word counts like ours, the value is always bounded between 0 and 1. A value of 0 means that both vectors are not at all similar, whereas a value of 1 means that both vectors point in exactly the same direction: the same words occur in the same proportions. This means that interpreting a similarity score is relatively easy, even for humans.

The second advantage is that it automatically takes the length of the two vectors into account. In turn, this means that the length of a text does not influence the resulting score. Without this, text length could be a problem, since you naturally use the prominent words more often in a longer text. That would skew the suggestions in a bad way, giving more weight to longer texts.
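
For reference, the cosine similarity of two vectors is their dot product divided by the product of their lengths. Here is a plain Python sketch of the textbook formula, not a copy of our server-side code:

```python
import math


def cosine_similarity(vector_a, vector_b):
    """Cosine of the angle between two equally long vectors."""
    dot_product = sum(a * b for a, b in zip(vector_a, vector_b))
    length_a = math.sqrt(sum(a * a for a in vector_a))
    length_b = math.sqrt(sum(b * b for b in vector_b))
    if length_a == 0 or length_b == 0:
        # An empty bag of words is not similar to anything.
        return 0.0
    return dot_product / (length_a * length_b)


short_text = [1, 2, 3]
long_text = [10, 20, 30]   # same proportions, ten times as many words
unrelated = [0, 0, 0, 5]

print(round(cosine_similarity(short_text, long_text), 6))  # 1.0
print(cosine_similarity([1, 0], [0, 1]))                   # 0.0
```

The first call illustrates the second advantage: a text that uses the same words in the same proportions scores as fully similar, no matter how long it is.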

A secret ingredient

Before we can fit the parts together to create the little search engine, we add another ingredient. We multiply the weight of each prominent word with its inverse document frequency. The document frequency of a word is the number of documents (in this case posts, pages and terms) in which the word occurs.

Why are we interested in a word’s document frequency? Because the more often you use a word, the less useful it is when generating internal linking suggestions. Let us take an example to illustrate this. Let us say that you have, like Yoast, a blog about SEO. The word ‘SEO’ would occur in almost every post, page or term on your blog. The fact that two posts share the word SEO does not add anything useful at all. In fact, it leads to noise in the algorithm, with almost every piece of content matching with every other piece!

To fix this problem, we multiply each prominent word’s weight with the inverse of its document frequency. This is just a fancy way of saying that the more often a word occurs on your site, the less overall weight it gets when generating internal linking suggestions.
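
A small Python sketch of this weighting. It uses the plain reciprocal of the document frequency, which is the simplest reading of the description above; many search systems use a logarithmic variant instead, and this example makes no claim about the exact formula in the plugin. All names are made up for the illustration.

```python
from collections import Counter


def document_frequencies(documents):
    """Count in how many documents each word occurs."""
    frequencies = Counter()
    for word_counts in documents:
        frequencies.update(word_counts.keys())
    return frequencies


def apply_inverse_document_frequency(word_counts, frequencies):
    """Divide each word's weight by the number of documents it occurs in."""
    return {
        word: count / frequencies[word]
        for word, count in word_counts.items()
        if frequencies[word] > 0
    }


documents = [
    {"seo": 3, "cat": 2},
    {"seo": 5},
    {"seo": 1, "dog": 4},
]
frequencies = document_frequencies(documents)

# 'seo' occurs in all three documents, so its weight shrinks the most.
print(apply_inverse_document_frequency(documents[0], frequencies))  # {'seo': 1.0, 'cat': 2.0}
```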

Tying it all together

Now that we have all the parts of the search engine, we can tie them together:

You open a post, page or term page.
While writing your awesome content, a list of its most prominent words is calculated.
This list is sent to your website’s server.
The server computes a list of the most similar content.
This list gets sent back to your internet browser.
The internal linking suggestion tool shows you a list of other awesome content that you can link to.
Whenever you save the post, page or term page, its prominent words are saved to your website’s database.
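
The server-side step in the list above, computing the most similar content, could be sketched in Python as follows. The index is represented as a plain dictionary instead of a database table, and the names suggest_links and cosine, as well as the sample titles, are hypothetical.

```python
import math


def cosine(query, document):
    """Cosine similarity for sparse {word: weight} dictionaries."""
    dot_product = sum(weight * document.get(word, 0) for word, weight in query.items())
    length_query = math.sqrt(sum(weight * weight for weight in query.values()))
    length_document = math.sqrt(sum(weight * weight for weight in document.values()))
    if length_query == 0 or length_document == 0:
        return 0.0
    return dot_product / (length_query * length_document)


def suggest_links(query, index, limit=5):
    """Rank indexed content by similarity to the query, best match first."""
    scored = [(cosine(query, words), title) for title, words in index.items()]
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in relevant[:limit]]


index = {
    "Cat care": {"cat": 3, "food": 1},
    "Dog training": {"dog": 4, "train": 2},
    "Pet food": {"food": 3, "cat": 1},
}

print(suggest_links({"cat": 2, "food": 1}, index))  # ['Cat care', 'Pet food']
```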

That is it. Now you know how the internal linking suggestions are generated. With this useful tool, you can add links without needing to rummage through your website for useful content to link to. Happy blogging!

Read on: A much-improved internal linking tool: What’s new? »

The post Linking suggestions, a look behind the scenes appeared first on Yoast.

Yoast SEO 14.0.x; or “Why you should never bypass wpdb”

Joost de Valk — Thu, 30 Apr 2020 17:29:27 +0000

Yoast SEO 14.0 was a release that needed quite a few patches so far. We’re sorry about that. In building Yoast SEO 14.0 we made a mistake: we bypassed wpdb. That caused issues and errors for some users when they tried to upgrade, which we’re very sorry about. This post explains the mistakes we made, and our reasoning. I hope it explains a bit of how this happens, and, for the developers among you, prevents you from making these same mistakes.

Let’s start with what we got right, which paves the way for an exciting future:

We started using a custom database table for our metadata. As posts tend to have a lot of SEO metadata, this sped up our frontend performance significantly.
To access that data, we started using an ORM (which stands for Object-Relational Mapper). Think of that as a more convenient way of mapping the data you use in your application to the data in the database.
We didn’t remove the data from wp_postmeta, we just copied it to where we could access it most efficiently, but left it around (and even updated it there) for when other people relied on it being there.
The vast majority of websites upgraded with no problems, errors or issues.

Note that all the stuff described below was done because a portion of our users was affected. At the same time, the vast majority was not affected, and in fact, saw an immediate speed improvement on their sites.

Picking our ORM

In choosing our ORM we were severely limited as we had to support PHP 5.6. The problem here is that while WordPress has only recently increased its minimum requirements to PHP 5.6, the rest of the PHP world considers that version ancient history. We would have preferred to use Doctrine, but unfortunately, due to this PHP 5.6 requirement, this was unfeasible. Which is why we ended up with Idiorm. It’s not perfect, but it was still a lot less work than building our own. We also started using Ruckusing for database migrations. This allows us to easily set up and change our database schema.

Idiorm and Ruckusing both had a dependency though: they are built on top of PDO, a way of connecting to the database that is slightly different from the mysqli and mysql interfaces WordPress normally uses in wpdb. And this is where we made the wrong decision: we decided to open a second connection to the database, using PDO, bypassing wpdb.

Releasing 14.0

Timeline of events

As we discovered issues and edge-cases, our team worked tirelessly to diagnose, debug, and release patches. This meant that we launched multiple small updates, which resulted in some users having to (re)index their sites multiple times. We’re sorry about the inconvenience that caused.

April 28th, 08:22 – Released 14.0
April 28th, 17:35 – Released 14.0.1
April 29th, 13:00 – Released 14.0.2
April 30th, 14:04 – Released 14.0.3
April 30th, 17:05 – Released 14.0.4

The full details of each of these releases (and their changelogs) can be found at our plugin page on wordpress.org.

We thought we were ready to release our 14.0 update. We’d tested this extensively, across multiple configurations and systems, and had asked hosts and other plugin developers all over the world to test along with us. We can’t always catch every obscure edge-case, but we weren’t really expecting a lot of problems. Well. We were wrong. Unfortunately, when you have the number of installs we have, even a problem that affects 0.1% of our users becomes an iceberg.

It turns out loads of people have very non-standard DB connections. Ranging from setups where DB_HOST or DB_CHARSET aren’t defined, to larger issues where they are running encrypted database connections, which of course our PDO solution didn’t know about.

We encountered lots and lots of small issues. We could have fixed all that, potentially, but we’d be spending a lot of time doing that and new issues would inevitably keep popping up. So on Tuesday we made a very hard call. We released version 14.0.1 with fixes for all the things we could fix immediately, and then started working on the bigger and more complex challenges, which became version 14.0.2.

The fix, 14.0.2

So we started working on a better solution: still using Idiorm, but passing it all through wpdb. We’ve forked Idiorm and we’ve “semi”-forked Ruckusing: both now connect to the database through wpdb. This was quite a bit of work, but in hindsight, it wasn’t as bad as we thought it would be. Had we known we could do it “this easily”, we would have immediately made that step and used this approach – but this is new territory; there are few WordPress plugins which extend the platform in this way, at this scale, with this number of unknowns.

The results

The upside of not connecting through wpdb was that it had none of the wpdb overhead. On the other hand, going through wpdb means we no longer have to open a second connection. In our tests, we see mixed results, but overall, not a whole lot of impact. An added positive side effect is that we can once again inspect our queries through our favorite development plugin, Query Monitor.

Next problem: updates not working as they should

The next problem we ran into was a vexing one: some people were updating Yoast SEO and they had all the new files, but they were reporting errors with files that no longer existed in the plugin. Note that this does not happen if you update through the WordPress admin. That was baffling. In the end, it turned out that some of the WordPress site management tools simply copy files over the old files, instead of removing the directory entirely and replacing it with a directory that contains the new version. This caused errors, but it took a while to diagnose what was going on. That’s because this isn’t how modern hosting or plugin systems should behave, so it never crossed our minds that they would behave in that way. We got there with the kind help of Ipstenu and Otto in the WordPress forums team, though. Note that the upgrade process for plugins is dealt with by WordPress core, not by our plugin, so this was out of our hands. Regardless, we had to find a workaround.

Because of how Ruckusing, which I mentioned above, works, we included files in a directory to run migrations. If the old files in that directory didn’t get deleted, they’d reference classes that we had since removed, and everything breaks. To fix that, we simply renamed that directory, which prevents this from happening for now, and released 14.0.3, this afternoon.
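To make that failure mode concrete, here is a small simulation of the two update strategies. It is only a sketch: the file names are made up, the "plugin" is just a temporary directory, and it says nothing about how any specific site management tool is implemented.

```php
<?php
// Hypothetical releases: the new version drops one file and adds another.
$old_release = [ 'plugin.php', 'removed-in-new-version.php' ];
$new_release = [ 'plugin.php', 'added-in-new-version.php' ];

/** Writes the given (empty) files into $dir; files already there are left alone. */
function demo_copy_release_into( string $dir, array $files ): void {
    if ( ! is_dir( $dir ) ) {
        mkdir( $dir, 0777, true );
    }
    foreach ( $files as $file ) {
        file_put_contents( $dir . '/' . $file, '' );
    }
}

/** Returns the names of the files in $dir. */
function demo_files_in( string $dir ): array {
    return array_values( array_diff( scandir( $dir ), [ '.', '..' ] ) );
}

/** Deletes a flat directory and everything in it. */
function demo_delete_dir( string $dir ): void {
    foreach ( demo_files_in( $dir ) as $file ) {
        unlink( $dir . '/' . $file );
    }
    rmdir( $dir );
}

$base = sys_get_temp_dir() . '/update-demo-' . uniqid();

// Strategy 1: copy the new files over the old ones.
demo_copy_release_into( $base . '/overlay', $old_release );
demo_copy_release_into( $base . '/overlay', $new_release );
$overlay_result = demo_files_in( $base . '/overlay' );

// Strategy 2: remove the directory, then put the new version in place.
demo_copy_release_into( $base . '/replace', $old_release );
demo_delete_dir( $base . '/replace' );
demo_copy_release_into( $base . '/replace', $new_release );
$replace_result = demo_files_in( $base . '/replace' );

// Clean up the temporary directories.
demo_delete_dir( $base . '/overlay' );
demo_delete_dir( $base . '/replace' );
rmdir( $base );

echo 'Overlay: ' . implode( ', ', $overlay_result ) . PHP_EOL;
echo 'Replace: ' . implode( ', ', $replace_result ) . PHP_EOL;
```

With the overlay strategy the removed file is still on disk after the update, which is the kind of leftover that can keep referencing classes that no longer exist.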

wpdb bites back

Lazy loading

When you update to 14.0 or later, we add a row to our indexables table for every post, page, tag etc. on your site. We can either do this in one go, through our indexing process, or we do it lazily. Every time a URL for which there is no indexable is loaded, we create it, and store it. The next time you open that URL it can immediately take the data from our table and thus be faster.
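The lazy variant boils down to a "find or create" lookup. The sketch below uses an in-memory array instead of a database table, and the class and method names are hypothetical; it is not Yoast SEO’s actual repository code.

```php
<?php
/** In-memory stand-in for the indexables table (hypothetical, for illustration only). */
class Demo_Indexable_Store {
    /** @var array Stored rows, keyed by URL. */
    private $rows = [];

    /** @var int Counts how often the expensive build step ran. */
    public $builds = 0;

    /** Returns the stored row for a URL, building and storing it on first use. */
    public function find_or_create( string $url ): array {
        if ( ! isset( $this->rows[ $url ] ) ) {
            $this->rows[ $url ] = $this->build( $url );
        }
        return $this->rows[ $url ];
    }

    /** Pretends to do the expensive work of collecting metadata for a URL. */
    private function build( string $url ): array {
        $this->builds++;
        return [ 'permalink' => $url, 'title' => 'Title for ' . $url ];
    }
}

$store = new Demo_Indexable_Store();
$store->find_or_create( 'https://example.com/hello-world/' ); // First visit: builds the row.
$store->find_or_create( 'https://example.com/hello-world/' ); // Second visit: reuses it.

echo $store->builds . PHP_EOL; // 1
```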

And then came the final problem, on Thursday 30th April, which caused us to do yet another update. It turns out that when you wpdb::prepare a query which has null in it, wpdb escapes it into 0, which is a long-standing issue in WordPress core. This is typically something we would have caught if we had not made the switch to wpdb in a patch release: had we made our decision to use wpdb sooner, we wouldn’t have caught this problem this late. Unfortunately, it meant that we were storing the wrong values for noindex, nofollow and other robots meta tags. So… We had to do another patch release, 14.0.4, and re-index everybody’s sites to make sure we stored the data right. The re-indexing is possible because we left all the data in wp_postmeta, a decision I’m now very thankful we made.
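The null-to-0 effect is easy to reproduce with plain printf-style formatting. The function below is a simplified stand-in, not wpdb::prepare itself (which also escapes and quotes), and the table and column names are invented for the example.

```php
<?php
/** Simplified printf-style prepare; NOT the real wpdb::prepare(). */
function demo_prepare( string $query, ...$args ): string {
    return vsprintf( $query, $args );
}

/** Formats a nullable integer as SQL, keeping NULL distinct from 0. */
function demo_nullable_int( $value ): string {
    return null === $value ? 'NULL' : (string) (int) $value;
}

// null is meant to say "no explicit value", but %d casts it to 0.
$lossy = demo_prepare(
    'UPDATE demo_indexable SET robots_noindex = %d WHERE id = %d',
    null,
    5
);

// One possible way around it: decide on NULL before formatting.
$explicit = demo_prepare(
    'UPDATE demo_indexable SET robots_noindex = %s WHERE id = %d',
    demo_nullable_int( null ),
    5
);

echo $lossy . PHP_EOL;    // UPDATE demo_indexable SET robots_noindex = 0 WHERE id = 5
echo $explicit . PHP_EOL; // UPDATE demo_indexable SET robots_noindex = NULL WHERE id = 5
```

For a setting like noindex, where "not set" and "explicitly off" mean different things, losing that distinction is what makes the stored data wrong.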

Is it worth it?

After reading all this, you might think: is this worth it all? And despite the fact that I and a part of our dev team have been working incredibly hard the last 3 days, the answer is still a resounding: yes! We’re laying a foundation for a much better future and we’re making sites instantly faster. I hope we’ve caught all the major issues now, something I of course can’t guarantee, and once again, I’m sorry if we caused you trouble!

The post Yoast SEO 14.0.x; or “Why you should never bypass wpdb” appeared first on Yoast.

Yoast SEO 14.0: WP CLI index command

Joost de Valk — Thu, 23 Apr 2020 09:50:03 +0000

We have added a WP CLI command to Yoast SEO 14.0. In Yoast SEO 14.0 we created new tables in which we combine all the metadata for indexable objects on a site. Please see Omar’s post for the reasoning behind this. Regardless of whether you’re using Yoast SEO in the conventional way or through the new headless functionality, this works the best and the fastest if your site’s meta data is fully indexed. You can make sure of that by running our WP CLI command!

If you don’t know what WP CLI is, you’re missing out! It’s a command line interface for WordPress that makes loads of tasks easier. Read all about it on wp-cli.org.

If you’re using our REST API or our surfaces, running this CLI command on your site before going into production with those is very important.

Syntax

The syntax for this WP CLI command is very simple:

wp yoast index

The output will look something like this:

Indexing posts  100% [==============================] 0:00 / 0:00
Indexing terms  100% [==============================] 0:00 / 0:00
Indexing post type archives  100% [=================] 0:00 / 0:00
Indexing general objects  100% [====================] 0:00 / 0:00

If one of these doesn’t show, that is because all of the items in it were already indexed. Authors are indexed as part of the posts indexation process. If you want to test this index command multiple times, please use the Yoast Test Helper. If you hit the “Reset Indexables tables & migrations” button in that plugin, it’ll delete the Indexables tables. After that you can run the process again.

Potential Yoast SEO WP CLI errors

You might get this error:

Error: 'index' is not a registered subcommand of 'yoast'. See 'wp help yoast' for available subcommands.

This means you’re not on the correct version of Yoast SEO. Please note that you need to be on Yoast SEO 14.0 or higher for the Yoast SEO WP CLI command index to work.

Questions? Let us know in the comments!

The exciting new technology that is indexables

Omar Reiss — Tue, 21 Apr 2020 09:59:03 +0000

Yoast SEO 14.0 ushered in a new age for the Yoast SEO plugin. In it, we’ve rewritten our entire metadata engine and built an abstraction called indexables. We believe this rewrite will boost WordPress + Yoast SEO as an excellent platform for SEO for many years to come! Please let us walk you through the technological advancements and their meaning.

Think like a search engine

SEO has always been about thinking like a search engine. Search engines want to retrieve as much information as possible and use it to provide searchers with the best possible answers to their questions. This means we constantly have to ask ourselves; how does a search engine treat information?

Information on the web is addressable via URLs. Anything with a URL could be discovered, scraped, indexed, and shown in the search results. WordPress has posts, pages, custom post types, categories, tags, custom taxonomies, different types of archives, special pages, and maybe even more types of content. Do you think a search engine like Google cares about that? It doesn’t. It just looks for things with a URL that it can scrape and index.

Better information architecture for WordPress

From an SEO perspective, any type of page in WordPress is simply an indexable object. This is the basic intuition that has led to indexables. At its core, indexables is just a database table that contains metadata and URLs for all indexables on a site. The abstraction normalizes the information architecture for any type of page in WordPress and makes its metadata directly queryable. On top of that, we can now quickly and economically relate different indexable objects to each other and other things, such as links, redirects, attachments, and perhaps even schema markup.

This is a huge deal. Links, for example, are references from one information object with a URL (an indexable) to another with a URL (another indexable). We’ve been storing links for posts in WordPress for quite some time. We can now start doing that for any type of page in WordPress. And by relating them to indexables, we could create a graph of all information on a site inside WordPress. That would enable us to provide users with valuable insights about their site’s SEO.

Direct benefits of indexables

But the direct benefits are great. With Yoast SEO, we seem to be hitting massive performance gains, have dramatically reduced the cost of change for our metadata functionality, and can deliver much nicer and more stable APIs for third-party developers who want to integrate with us. Last but not least, Yoast SEO is now fully ready for headless WordPress. Let’s quickly go over these benefits and how they came about.

Performance gains

WordPress offers standard APIs to store metadata for content resources such as posts and terms. The post meta and term meta APIs implement an entity-attribute-value (EAV) model. To WordPress’s advantage, this is a very flexible and open model that allows any developer to easily add a custom field or piece of metadata to a post or category. However, this model also quickly becomes quite slow to query, especially on big sites with many custom fields. It is part of the reason all SEO plugins tend to slow down a website. They have a ton of metadata to output, all of which needs to be queried separately and inefficiently.

Moving to a custom table, we move from EAV to a relational model for fetching SEO metadata, making it much easier and more efficient to query metadata for any page in WordPress. This advantage becomes especially big when you start relating objects to each other. An indexable is now directly mapped to a term or a post. We can now get all of our data in a single query, which makes any request much faster: another big performance gain we get from storing the URL with the indexable!

Because of this, breadcrumbs can now be generated in a breeze, whereas in the past, this used to take quite some expensive calculations. In a future release, we also plan to generate our XML sitemaps straight from indexables. This will probably make our XML sitemaps the fastest and most reliable on the web, which is especially good news for large websites, for whom generating XML sitemaps has always been a pain.
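The difference between the EAV shape and the one-row-per-indexable shape can be shown with plain arrays. This is a data-shape sketch only: the keys, columns and function names are illustrative and are not Yoast SEO’s actual schema.

```php
<?php
// EAV shape: one row per attribute, as in a postmeta-style table.
$demo_postmeta = [
    [ 'post_id' => 1, 'meta_key' => 'demo_title', 'meta_value' => 'Hello world' ],
    [ 'post_id' => 1, 'meta_key' => 'demo_description', 'meta_value' => 'A first post.' ],
    [ 'post_id' => 1, 'meta_key' => 'demo_noindex', 'meta_value' => '0' ],
    [ 'post_id' => 2, 'meta_key' => 'demo_title', 'meta_value' => 'About' ],
];

/** Collects every attribute row for one post and pivots them into a single record. */
function demo_meta_from_eav( array $rows, int $post_id ): array {
    $record = [];
    foreach ( $rows as $row ) {
        if ( $row['post_id'] === $post_id ) {
            $record[ $row['meta_key'] ] = $row['meta_value'];
        }
    }
    return $record;
}

// Relational shape: one row per indexable, one column per attribute, keyed by URL.
$demo_indexables = [
    'https://example.com/hello-world/' => [
        'object_id'   => 1,
        'object_type' => 'post',
        'title'       => 'Hello world',
        'description' => 'A first post.',
        'noindex'     => false,
    ],
];

/** A single keyed lookup returns the whole record, or null when nothing is stored. */
function demo_meta_from_indexable( array $indexables, string $url ): ?array {
    return $indexables[ $url ] ?? null;
}

$from_eav       = demo_meta_from_eav( $demo_postmeta, 1 );
$from_indexable = demo_meta_from_indexable( $demo_indexables, 'https://example.com/hello-world/' );

echo count( $from_eav ) . ' attribute rows pivoted' . PHP_EOL;
echo $from_indexable['title'] . PHP_EOL;
```

In the EAV case every attribute is its own row that has to be found and reassembled; in the second case the URL leads straight to one complete record.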

Lower cost of change

In Yoast SEO 14.0, we completely rewrote our front-end code and structure. We’ve moved away from a PHP 5.2 compatible procedural style architecture to an object-oriented architecture that makes extended use of all features that become available with higher versions of PHP (5.6+). We’re starting to use namespaces, Symfony’s dependency injection container, and strict separation between pure PHP services and stateful objects like ORM models and value objects.

These tools and strategies have helped us structure our code in a way that makes it much easier to reason about. The benefits of this are legion:

The code becomes much easier to debug.
We can change code more easily because code concepts are better defined, and it’s clearer what part of the code is responsible for what.
The consequences of code changes are easier to oversee, making unforeseen bugs less likely.
The code becomes easy to unit test, which helps prevent future regressions even more.
The time to change or fix something in the code drastically diminishes because of the above. This means we can iterate faster on new features and quickly fix bugs.

We’ve experienced these benefits firsthand. We’ve seen that our development team has had a much easier time understanding the functionality. They even came up with clever and critical questions about it because inconsistencies were much easier to spot. This has already resulted in discovering some small bugs and inconsistencies that we might’ve overlooked. We’ve also seen dramatic productivity improvements, with developers often reaching twice the productivity or more when working on the new code.

Better APIs through surfaces

Figuring out how to offer a consistent API to integrators has been a longstanding challenge. Any code executed within a platform like WordPress shares the same runtime with all other active plugins and themes. Our “public” interface has technically always been available to integrators. Any class, public method, or function could be invoked by third-party code.

Now the question for us has always been: should you use those? We’ve never been fond of plugins relying on our public APIs outside our filters and actions. This would mean we’d have to keep those APIs backward compatible forever to prevent sites from breaking. That would slow us down and would increase the cost of change again. For WordPress, it makes more sense to treat all APIs as public. After all, it is the platform that is running all the code. But how about plugins? Treating any plugin like a standard library you control like any other dependency feels off.

We now have a very elegant pattern for offering developers a sustainable API contract with Yoast plugins. We call it surfaces, and we hope other plugin authors will also start adopting this pattern. A surface is an object we explicitly expose for third-party use and promise to keep backward compatible.

To that end, we’ve introduced a global YoastSEO() function. It has direct access to our DI container and returns a top-level surface, which exposes other surfaces via magic getters (which might, in turn, expose more surfaces). Through surfaces, we can expose pure PHP services, repositories, and factories straight from the container or create more toned-down objects that expose only a subset of the behavior. While globally available, each surface is an object you can inject into your code, making it possible for integrators to keep their code decoupled from Yoast SEO. Go check out our surfaces!
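For plugin authors who want to try the pattern, a minimal sketch of a top-level surface with magic getters could look like this. All class and function names here are hypothetical, and the array of factories is a stand-in for a real DI container; this is not the Yoast SEO implementation.

```php
<?php
/** Hypothetical nested surface exposing a small, stable subset of behavior. */
class Demo_Helpers_Surface {
    public function is_indexable( string $post_type ): bool {
        return in_array( $post_type, [ 'post', 'page' ], true );
    }
}

/** Hypothetical top-level surface; nested surfaces are resolved lazily via __get(). */
class Demo_Main_Surface {
    /** @var callable[] Factories keyed by surface name (stand-in for a DI container). */
    private $factories;

    /** @var object[] Surfaces that were already created. */
    private $resolved = [];

    public function __construct( array $factories ) {
        $this->factories = $factories;
    }

    public function __get( $name ) {
        if ( ! isset( $this->factories[ $name ] ) ) {
            throw new Exception( 'Unknown surface: ' . $name );
        }
        if ( ! isset( $this->resolved[ $name ] ) ) {
            $this->resolved[ $name ] = ( $this->factories[ $name ] )();
        }
        return $this->resolved[ $name ];
    }
}

/** Global accessor, similar in spirit to a function like YoastSEO(). */
function DemoSEO(): Demo_Main_Surface {
    static $surface = null;
    if ( null === $surface ) {
        $surface = new Demo_Main_Surface(
            [
                'helpers' => function () {
                    return new Demo_Helpers_Surface();
                },
            ]
        );
    }
    return $surface;
}

var_dump( DemoSEO()->helpers->is_indexable( 'post' ) ); // bool(true)
```

Because each surface is an ordinary object, integrators can pass `DemoSEO()->helpers` into their own constructors instead of calling the global function everywhere.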

Headless WordPress

Yoast SEO makes generating metadata output for any frontend request easy and efficient. Because of this, we’ve decided to add a simple metadata endpoint that will make it possible for site builders to use Yoast SEO on headless installs. You can now simply fetch the metadata for a page via the REST API and output it anywhere you like.

How to generate indexables for all your pages?

Yoast SEO will work best if it has an indexable stored for every page on your website. The plugin automatically generates them over time, which might not cover everything. So on top of that, we also provide tools to simultaneously generate indexables for the entire site. Here’s what we provide:

Yoast SEO adds or updates an indexable whenever you save content or metadata.
Yoast SEO generates and stores an indexable whenever a page is visited and doesn’t have an indexable yet. This only needs to be done once per page. After that, we’ll use the indexable, and the page load will be much faster. Any page that gets traffic will eventually have an indexable.
We’ve added a tool to the Yoast SEO settings to allow site administrators to add indexables for their entire site at once.
We’ve added a WP CLI command to allow site administrators to add indexables for their entire site via the command line.

A look into the future

What we’re seeing today is truly only the beginning. Believe it or not, the real benefits of indexables are in what we can build on top of it. We can now start delivering features that will give a whole new meaning to sitewide SEO. While not giving too much away here, the first thing we will build on top of this is a completely revised version of our internal linking algorithm. We’ll soon be able to make our internal linking suggestions sitewide and much more accurate by building a search engine into WordPress on top of indexables. Sounds a bit crazy, right? Yet, with indexables and some help from our computational linguists, we can make all this relatively simple. More on that later. Check out the API documentation and stay tuned for the upcoming Yoast SEO release and many more goodies!

Yoast SEO 14.0: REST API endpoint

Joost de Valk — Tue, 21 Apr 2020 07:54:55 +0000

With Yoast SEO 14.0 we’re introducing lots of new goodies. One of them is a Yoast SEO REST API endpoint that’ll give you all the metadata you need for a specific URL. This will make it very easy for headless WordPress sites to use Yoast SEO for all their SEO meta output.

There are two ways of using this: through its inclusion in the normal WP REST API responses and through our own endpoint.

Inclusion in WP REST API responses

When you’re retrieving a post like so:

https://example.com/wp-json/wp/v2/posts/1

Or like this:

https://example.com/wp-json/wp/v2/posts?slug=hello-world

You’ll receive a normal WP REST API response, with an additional field: yoast_head. This additional field will contain a blob with all the necessary meta tags for that page. This works for posts, pages, categories, tags and all custom post types and custom taxonomies.

For post type archives, when you query the types endpoint the meta is included there, also on the yoast_head field. If it is not there, the post type does not have a post type archive enabled.
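On the consuming side, reading that field is a matter of decoding the response and checking whether yoast_head is present. The helper below is a sketch with a hypothetical name, and the sample JSON is made up and far shorter than a real response.

```php
<?php
/** Returns the yoast_head blob from a WP REST API item, or '' when the field is absent. */
function demo_extract_yoast_head( string $json ): string {
    $item = json_decode( $json, true );
    if ( ! is_array( $item ) || ! isset( $item['yoast_head'] ) || ! is_string( $item['yoast_head'] ) ) {
        return '';
    }
    return $item['yoast_head'];
}

// Hypothetical, heavily shortened response bodies.
$with_head    = '{"id":1,"slug":"hello-world","yoast_head":"<title>Hello world</title>"}';
$without_head = '{"slug":"book","name":"Books"}';

echo demo_extract_yoast_head( $with_head ) . PHP_EOL; // <title>Hello world</title>
var_dump( demo_extract_yoast_head( $without_head ) ); // string(0) ""
```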

Yoast SEO REST API syntax

The syntax is very simple: you just send a GET request to /wp-json/yoast/v1/get_head?url= with the proper URL, for example:

https://example.com/wp-json/yoast/v1/get_head?url=https://example.com/hello-world/

This will return the following:

{
  "head": "the complete, escaped output for Yoast SEO",
  "status": 200
}

The head contains the complete meta output for the page. This means the Yoast SEO REST API output contains everything:

The title
The meta description, if you have one
Robots meta tags
The canonical URL
Our Schema output
OpenGraph meta data
Twitter meta data

For an example, see this output for a post here on developer.yoast.com.

The API returns `404` for an existing page?

If the status is not 200 but you’re certain the page exists, you’ll need to make sure your site is completely indexed. Just hitting save on the backend should save the post or page to the Indexables database. Note that Yoast SEO will return a head for other statuses too, so you can use the output on, for instance, 404 pages.
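A headless front end will usually want both pieces of the response: the head to output, and the status to decide how to treat the page. Here is a sketch of that handling; the function name and the sample bodies are hypothetical, and only the head and status fields shown above are assumed.

```php
<?php
/**
 * Interprets a get_head response body.
 * Returns the head markup plus whether the URL resolved with status 200.
 */
function demo_parse_get_head( string $json ): array {
    $data = json_decode( $json, true );
    if ( ! is_array( $data ) || ! isset( $data['head'], $data['status'] ) ) {
        return [ 'head' => '', 'ok' => false ];
    }
    return [
        'head' => (string) $data['head'],
        'ok'   => 200 === (int) $data['status'],
    ];
}

// Hypothetical response bodies, shaped like the example above.
$found   = demo_parse_get_head( '{"head":"<title>Hello world</title>","status":200}' );
$missing = demo_parse_get_head( '{"head":"<title>Page not found</title>","status":404}' );

echo $found['head'] . PHP_EOL; // <title>Hello world</title>
var_dump( $missing['ok'] );    // bool(false)
```

Since a head is returned for other statuses as well, the markup in `$missing['head']` can still be used on a 404 template.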

If you have a notice in your site’s admin about indexing your site: please run that index process if you intend to rely on this API.

I don’t want this API on my site!

You can easily disable this API by going to SEO – General – Features and disabling the feature toggle.

Headless WordPress and Yoast SEO

With this Yoast SEO REST API endpoint we hope to support those people building headless WordPress sites a lot better. But we’d love to hear from you if you’re currently building headless WordPress sites. Is this useful to you? Would you like to see changes? Let us know in the comments!

Yoast Test Helper – easy testing!

Joost de Valk — Mon, 20 Apr 2020 12:04:16 +0000

We’ve released our Yoast Test Helper plugin to WordPress.org. This is a plugin our teams use internally while testing Yoast SEO and it could be very useful for you in testing Yoast SEO on your staging environment. Let me explain the nicest features:

Switching between free and premium
For obvious reasons, this is something we do a fair bit of in development, which is why we’ve made it very easy to do.
Pretty printing Schema output
When you’re working with Schema, it’s very nice if the Schema output is pretty printed, so you can easily see your changes. This happens automatically when you enable the “development mode” in the plugin.
Changing plugin database options
By setting the database option of a plugin to a version lower than the current version, you can trigger the needed option migrations. This allows us to test upgrade paths way more reliably.
Quickly create post types & taxonomies
The Post types & Taxonomies functionality enables a Books and Movies post type plus taxonomies at the click of a button. This can be very useful to quickly test how certain functionality works with custom post types.
Resetting Indexables migrations
Very important in regard to our upcoming Yoast SEO 14.0 Indexables release: by hitting the “Reset Indexables tables & Migrations” button, you can reset the Indexables database. This will cause all our migrations to run again.
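Why lowering a stored version number triggers migrations is easiest to see in code. The runner below is a generic sketch of version-gated upgrades: the function name, version numbers and callbacks are all made up, and it is not how Yoast SEO’s upgrade routine is actually written.

```php
<?php
/**
 * Runs every migration that is newer than the stored version, oldest first.
 * Returns the versions whose migrations ran.
 */
function demo_run_migrations( string $stored_version, array $migrations ): array {
    uksort( $migrations, 'version_compare' );
    $ran = [];
    foreach ( $migrations as $version => $callback ) {
        if ( version_compare( $stored_version, (string) $version, '<' ) ) {
            $callback();
            $ran[] = (string) $version;
        }
    }
    return $ran;
}

// Hypothetical migrations keyed by the version that introduced them.
$demo_migrations = [
    '14.0' => function () {
        // e.g. move settings to a new option.
    },
    '13.0' => function () {
        // e.g. rename an option key.
    },
];

$up_to_date = demo_run_migrations( '14.0', $demo_migrations ); // Nothing to do.
$rolled_back = demo_run_migrations( '12.9', $demo_migrations ); // Both run again.

echo count( $up_to_date ) . PHP_EOL;         // 0
echo implode( ', ', $rolled_back ) . PHP_EOL; // 13.0, 14.0
```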

There’s a lot more functionality in Yoast Test Helper, so have a play!

Of course, some of this functionality is very tied to the Yoast development team. I don’t expect you for instance to need to test with a different MyYoast install. But we’re sharing all of it in the hope of inspiring more people to share their development and testing tools.

You can find Yoast Test Helper on WordPress.org, or simply search for yoast test helper. Of course you can also find the plugin on GitHub. If you have questions, let us know!

Upcoming release: Yoast SEO 14.0 – Indexables

Joost de Valk — Mon, 06 Apr 2020 09:52:08 +0000

Team Yoast, while all working from home for the first time in our history, is working on one of the biggest improvements to Yoast SEO yet. This release, Yoast SEO 14.0, internally has the codename “Indexables”. This release, while fully backwards compatible, will change some of our known integration APIs. If you integrate with Yoast SEO you should therefore test with it as soon as possible.

What is changing?

This is basically a full re-build of all our front-end code. We will write blog posts about specific bits of it, but these are the highlights:

We’re moving all of our meta data from wp_postmeta to custom tables. This allows for far more efficient querying.
We’ve moved to using an ORM on top of $wpdb.
Our frontend architecture now uses dependency injection from Symfony.
We’re adding a REST API endpoint for easy integration in headless WordPress sites.
Our Schema API is changing.
We’re opening up a set of new API surfaces to make integration with Yoast SEO a lot easier; you can also more easily add meta tags. Read the documentation if you’re integrating with Yoast SEO, and update your code to be ready for 14.0 if you need to.

How can I test this?

You can find our latest Release Candidate below. You can also check out the release/14.0 branch from our GitHub repository and test your integrations. You’ll find that we add a couple of database tables and that these database tables are filled slowly as you and other people browse your site. The first load of a page can be slightly slower because of that; after that it’ll be faster. In our testing it looks like the frontend of your site will be anywhere between 5 and 20% faster depending on how many other plugins you run and what your theme is like (the heavier all the other stuff, the lower the impact of our speed improvements).

Download the WordPress SEO 14.0 RC

When is this coming?

Our scheduled release date is Tuesday, April 28th 2020.

Why didn’t I hear of this before?

If you didn’t hear of this before and would have liked to because you’re integrating with Yoast SEO in some way, we’re sorry! We have a special channel on our Slack for integrators, with whom we’ve already shared a zip for testing. Please do reach out to @jjcomack, @Michiel Heijmans or @joostdevalk on the WordPress slack.

Yoast SEO 14.0: using Yoast SEO surfaces

Joost de Valk — Mon, 06 Apr 2020 09:49:23 +0000

In Yoast SEO 14.0 we’ve introduced a formal way of integrating Yoast SEO into your code. We’ve added what we call a surface, which you reach through the YoastSEO() function. This surface gives easy access to lots of the features Yoast SEO has to offer. In this post I’ll show you some of the different helpers that are now easily accessible if you’re building an integration.

We’ll cover:

Getting SEO data for the current page.
Accessing (and using) our helpers.

Easily access SEO data for the current page

This is probably how most people would’ve integrated with Yoast SEO up till now: running one of our pieces of code and getting the title from it or the description. This was never really “easy”. It always involved getting an instance of our WPSEO_Frontend class, something like this:

WPSEO_Frontend::get_instance();

After that you had to do a bunch of stuff to get to our titles, and we never really made any of that easy. Now you can simply do this to get the title:

$title = YoastSEO()->meta->for_current_page()->title;

Need the description? Just as easy, you can even immediately echo it:

echo YoastSEO()->meta->for_current_page()->description;

The current_page surface exposes every bit of data we have on the current page, which all work in the same way; it’s a long list:

Variable	Type	Description
`$canonical`	`string`	The canonical URL for the current page.
`$description`	`string`	The meta description for the current page, if set.
`$title`	`string`	The SEO title for the current page.
`$id`	`string`	The requested object ID.
`$site_name`	`string`	The site name from the Yoast SEO settings.
`$wordpress_site_name`	`string`	The site name from the WordPress settings.
`$site_url`	`string`	The main URL for the site.
`$company_name`	`string`	The company name from the Knowledge graph settings.
`$company_logo_id`	`int`	The attachment ID for the company logo.
`$site_user_id`	`int`	If the site represents a ‘person’, this is the ID of the accompanying user profile.
`$site_represents`	`string`	Whether the site represents a ‘person’ or a ‘company’.
`$site_represents_reference`	`array|false`	The schema reference ID for what this site represents.
`$breadcrumbs_enabled`	`bool`	Whether breadcrumbs are enabled or not.
`$schema_page_type`	`string`	The Schema page type.
`$main_schema_id`	`string`	Schema ID that points to the main Schema thing on the page, usually the webpage or article Schema piece.
`$page_type`	`string`	The Schema page type.
`$meta_description`	`string`	The meta description for the current page, if set.
`$robots`	`array`	An array of the robots values set for the current page.
`$googlebot`	`array`	The meta robots values we specifically output for Googlebot on this page.
`$rel_next`	`string`	The next page in the series, if any.
`$rel_prev`	`string`	The previous page in the series, if any.
`$open_graph_enabled`	`bool`	Whether OpenGraph is enabled on this site.
`$open_graph_publisher`	`string`	The OpenGraph publisher reference.
`$open_graph_type`	`string`	The og:type.
`$open_graph_title`	`string`	The og:title.
`$open_graph_description`	`string`	The og:description.
`$open_graph_images`	`array`	The array of images we have for this page.
`$open_graph_url`	`string`	The og:url.
`$open_graph_site_name`	`string`	The og:site_name.
`$open_graph_article_publisher`	`string`	The article:publisher value.
`$open_graph_article_author`	`string`	The article:author value.
`$open_graph_article_published_time`	`string`	The article:published_time value.
`$open_graph_article_modified_time`	`string`	The article:modified_time value.
`$open_graph_locale`	`string`	The og:locale for the current page.
`$open_graph_fb_app_id`	`string`	The Facebook App ID.
`$schema`	`array`	The entire Schema array for the current page.
`$twitter_card`	`string`	The Twitter card type for the current page.
`$twitter_title`	`string`	The Twitter card title for the current page.
`$twitter_description`	`string`	The Twitter card description for the current page.
`$twitter_image`	`string`	The Twitter card image for the current page.
`$twitter_creator`	`string`	The Twitter card author for the current page.
`$twitter_site`	`string`	The Twitter card site reference for the current page.
`$source`	`array`	The source object for most of this page data.
`$breadcrumbs`	`array`	The breadcrumbs array for the current page.

Whether you need the OpenGraph description or the robots array, this has you covered. Get used to opening your favorite IDE, typing YoastSEO()->meta->for_current_page()-> and seeing the type hints for the exact bit of data you need.

For other pages?

You’ve probably guessed it when you saw the for_current_page() bit. You can do this for other pages and page types too. This would work, if post ID 2 exists:

YoastSEO()->meta->for_post( 2 )->title;

Even better, let’s say you have the URL https://example.com/test-page/:

$title       = YoastSEO()->meta->for_url( 'https://example.com/test-page/' )->title;
$description = YoastSEO()->meta->for_url( 'https://example.com/test-page/' )->description;

If that URL doesn’t exist in the Indexables table it’ll return false.
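Because the result can be false, it is worth guarding before reading properties from it. A small helper like the one below does that; the helper name and the fallback text are my own, and the stand-in object only mimics the title property from the table above.

```php
<?php
/**
 * Returns the SEO title from a meta result, or a fallback when the lookup
 * returned false (for instance because the URL has no indexable).
 *
 * @param object|false $meta     The result of a lookup such as for_url().
 * @param string       $fallback Text to use when there is no result.
 */
function demo_title_or_fallback( $meta, string $fallback ): string {
    if ( false === $meta ) {
        return $fallback;
    }
    return (string) $meta->title;
}

// Stand-in for a successful lookup; in WordPress this would come from Yoast SEO.
$demo_meta = (object) [ 'title' => 'Test page - Example' ];

echo demo_title_or_fallback( $demo_meta, 'Untitled' ) . PHP_EOL; // Test page - Example
echo demo_title_or_fallback( false, 'Untitled' ) . PHP_EOL;      // Untitled

// With Yoast SEO 14.0 or higher active, the call could look like this:
// $title = demo_title_or_fallback( YoastSEO()->meta->for_url( 'https://example.com/test-page/' ), 'Untitled' );
```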

Easy access to our helpers

Sometimes you need more than just the raw SEO data of a page. For instance, you need to know whether the current post type should be indexable at all. Well, our post_type helper can help with that:

YoastSEO()->helpers->post_type->is_indexable( get_post_type() )

This will return a simple boolean. Would you rather have a list of indexable post types? Then you should use:

$public_post_types = YoastSEO()->helpers->post_type->get_public_post_types();

The same works for taxonomies:

YoastSEO()->helpers->taxonomy->is_indexable( 'category' );

There are quite a few of these helpers, and not all of them may be equally useful to you. But do have a look around in your IDE and see which ones we have to offer, this too is a rather large list!

Yoast SEO 14.0: changing the Yoast Schema API

Joost de Valk — Mon, 06 Apr 2020 09:49:09 +0000

Due to the Yoast SEO indexable project using an entirely new namespaced Dependency Injection architecture, we’ve had to change some of our Schema API. All your existing integrations still work as we’ve made them backwards compatible, but we ask that you please update to the new code when you can. In this post I will explain the changes we made.

Overall, there are four important changes. All your old code still works, as we’ve added full backwards compatibility. However, that code will start throwing deprecation notices. The good news is, these changes are fairly simple and will make life easier in the long run.

We’ve changed the interface from a PHP interface to an abstract class.
Schema IDs for existing pieces should be gotten from a new class Schema_IDs.
We’ve changed how you generate Schema image tags.
If you were extending our existing classes, they have moved.

`WPSEO_Graph_Piece` becomes `Abstract_Schema_Piece`

If you were implementing your own graph pieces and adding them to our Schema graph, you probably had something like this:

class YoastCon implements \WPSEO_Graph_Piece {

You should replace it with:

use \Yoast\WP\SEO\Generators\Schema\Abstract_Schema_Piece;

class YoastCon extends Abstract_Schema_Piece {

This Abstract_Schema_Piece is an abstract class, and not an interface, because it has two public properties: $context and $helpers. These are filled magically and can be used to obtain important data.
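The reason this needs an abstract class is that a PHP interface cannot declare properties. The standalone sketch below shows the idea with hypothetical classes: the method names is_needed() and generate() are assumptions for the demo, and the properties are filled by hand here instead of by the framework.

```php
<?php
/** Hypothetical base class; an interface could not declare these two properties. */
abstract class Demo_Abstract_Piece {
    /** @var object Page-level data, set from the outside before generate() runs. */
    public $context;

    /** @var object Shared helper objects. */
    public $helpers;

    /** Whether this piece should be output for the current page. */
    abstract public function is_needed(): bool;

    /** Returns the Schema data for this piece. */
    abstract public function generate(): array;
}

/** Hypothetical piece that reads from the shared context. */
class Demo_Event_Piece extends Demo_Abstract_Piece {
    public function is_needed(): bool {
        return ! empty( $this->context->canonical );
    }

    public function generate(): array {
        return [
            '@type' => 'Event',
            '@id'   => $this->context->canonical . '#event',
        ];
    }
}

$piece          = new Demo_Event_Piece();
$piece->context = (object) [ 'canonical' => 'https://example.com/yoastcon/' ];
$piece->helpers = (object) [];

$graph = $piece->is_needed() ? $piece->generate() : [];

echo $graph['@id'] . PHP_EOL; // https://example.com/yoastcon/#event
```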

`WPSEO_Schema_IDs` becomes `Schema_IDs`

The Schema ID helper class has moved, it’s now called Schema_IDs, full namespace Yoast\WP\SEO\Config\Schema_IDs. This is simply a case of importing that class and then search & replacing all your code, so for instance WPSEO_Schema_IDs::PERSON_LOGO_HASH becomes Schema_IDs::PERSON_LOGO_HASH.

Schema image changes

Instead of doing this:

$image_schema_id = $context->canonical . '#product_image';
$image_helper    = new \WPSEO_Schema_Image( $image_schema_id );
$schema_image    = $image_helper->generate_from_attachment_id( $attachment_id );

You can now simply do this:

$image_schema_id = $this->context->canonical . '#product_image';
$schema_image    = $this->helpers->schema->image->generate_from_attachment_id( $image_schema_id, $attachment_id );

The context, which is now of class Meta_Tags_Context, is automatically exposed under $this->context. All the helpers you could need are under $this->helpers. As you can see, generating an image can be done using the Yoast\WP\SEO\Helpers\Schema\Image_Helper class, which is automatically available under $this->helpers->schema->image. You’ll find that if you use a modern IDE like PhpStorm or Visual Studio Code, all of these auto-expand for lots of coding convenience.

Note that the Schema ID for the image has become a parameter on the function call instead of on the class constructor.

Classes have moved

We’ve moved classes, according to the table below, and the signatures of some of the functions within them have changed slightly, nine times out of ten to get a Meta_Tags_Context parameter.

If you were, for instance, extending WPSEO_Schema_Person, that is now called Yoast\WP\SEO\Generators\Schema\Person. Extending Person is as simple as this:

use Yoast\WP\SEO\Generators\Schema\Person;

/**
 * Class Team_Member
 */
class Team_Member extends Person {

The old versus new class names:

Old class name	New class name
`WPSEO_Graph_Piece`	Now an `abstract class` instead of an `interface`: `Abstract_Schema_Piece`
`WPSEO_Schema_Article`	`Article`
`WPSEO_Schema_Author`	`Author`
`WPSEO_Schema_Breadcrumb`	`Breadcrumb`
`WPSEO_Schema_Context`	`Meta_Tags_Context` automatically available under `$this->context` when you extend `Abstract_Schema_Piece`
`WPSEO_Schema_FAQ`	`FAQ`
`WPSEO_Schema_FAQ_Question_List`	Rolled up into `FAQ`
`WPSEO_Schema_FAQ_Questions`	Rolled up into `FAQ`
`WPSEO_Schema_HowTo`	`HowTo`
`WPSEO_Schema_IDs`	`Schema_IDs`
`WPSEO_Schema_Image`	Please use `$this->helpers->schema->image`
`WPSEO_Schema_MainImage`	`Main_Image`
`WPSEO_Schema_Organization`	`Organization`
`WPSEO_Schema_Person`	`Person`
`WPSEO_Schema_WebPage`	`WebPage`
`WPSEO_Schema_Website`	`WebSite`
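
The one-to-one renames in this table lend themselves to a scripted search and replace. The following is only a rough sketch of that idea: the constant `SCHEMA_CLASS_RENAMES` and the function `rename_schema_classes()` are made up for this example and are not part of Yoast SEO. It deliberately skips the rows that need manual work (`WPSEO_Graph_Piece`, `WPSEO_Schema_Context`, `WPSEO_Schema_Image` and the two FAQ classes that were rolled up), and it does not add `use` statements or adjust changed method signatures.

```php
<?php
// One-to-one renames copied from the table; rows needing manual changes are left out.
const SCHEMA_CLASS_RENAMES = [
	'WPSEO_Schema_Article'      => 'Article',
	'WPSEO_Schema_Author'       => 'Author',
	'WPSEO_Schema_Breadcrumb'   => 'Breadcrumb',
	'WPSEO_Schema_FAQ'          => 'FAQ',
	'WPSEO_Schema_HowTo'        => 'HowTo',
	'WPSEO_Schema_IDs'          => 'Schema_IDs',
	'WPSEO_Schema_MainImage'    => 'Main_Image',
	'WPSEO_Schema_Organization' => 'Organization',
	'WPSEO_Schema_Person'       => 'Person',
	'WPSEO_Schema_WebPage'      => 'WebPage',
	'WPSEO_Schema_Website'      => 'WebSite',
];

/**
 * Replaces old schema class names in a piece of source code.
 *
 * Whole identifiers are matched, so WPSEO_Schema_FAQ_Questions is not
 * mistaken for WPSEO_Schema_FAQ. Names missing from the map stay untouched.
 */
function rename_schema_classes( string $code ): string {
	return preg_replace_callback(
		'/\bWPSEO_Schema_\w+/',
		static function ( array $match ): string {
			return SCHEMA_CLASS_RENAMES[ $match[0] ] ?? $match[0];
		},
		$code
	);
}

echo rename_schema_classes( 'WPSEO_Schema_IDs::PERSON_LOGO_HASH' ), "\n";
```

Anything the function leaves unchanged, such as a remaining `WPSEO_Schema_Image`, is a signal that the code around it needs a manual rewrite as described earlier in this post.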

The post Yoast SEO 14.0: changing the Yoast Schema API appeared first on Yoast.

Developer blog Archive • Yoast

Testing the new Yoast SEO settings interface

Agile approach

Company-wide acceptance testing

Functional testing

User experience (UX)

Accessibility

Performance & security testing

Multi-language support testing

Automated tests

Post-release support and monitoring

How we built the inclusive language analysis in Yoast SEO

Let’s meet the people involved

It all started with lots of research

The next step was the implementation

The challenge of context-dependence

Solutions to address context-dependence

Remaining challenges

Try out the analysis in Yoast SEO

Redesigning the Yoast SEO settings interface

First things first: download the alpha

Release date: January 24th

We focused on the utility belt, but not the armor

Staying true to the WP experience

The jump to Shopify and scaling to other platforms

Looking competitive again

New is always better scary

Our long and winding road forward

Notice: increasing the minimum PHP requirement for Yoast SEO

Background information

Help! I don’t know what to do!!

Why we don’t set the og:image:alt tag

Challenges with alt attributes

One image, many alternative texts

Decorative purpose

Informative purpose

Functional purpose

How does the og:image:alt tag work?

Why we removed the og:image:alt tag in Yoast SEO

Does it really matter?

Facebook

Twitter

Our take on the og:image:alt tag

The feature that almost made it

The feature idea

Security in mind

Test, test, then test some more

Why is Google crawling these? Why is it giving errors?

Conclusion

Behind the front-end: Yoast SEO for Shopify

Technologies

React

Redux and WordPress Data

Tailwind

Platform agnosticism

Configurability

Specifics to Shopify

Extending the Shopify editor

File uploading

Blogs vs. blog posts

Data transformation

Conclusion

Site search with Algolia – improved

Yoast SEO’s link metrics

The meta description & social images

Get this implementation with Yoast SEO Premium 16.7

The 2020 WordPress and PHP 8 compatibility report

Table of contents

Introduction

What’s in this report?

How come there are so many breaking changes in PHP 8?

Isn’t WordPress already compatible with PHP 8?

Part 1: The most worrisome breaking changes in PHP 8

Strict typing on internals in PHP 8

Consistent type errors

Arithmetic operator type checks

Magic methods type checks

Numeric string handling

Named parameters

API changes which could lead to type errors