Machine Learning News Hubb

Do You Know English Grammar Better Than ChatGPT? | by Lev Maximov | Feb, 2023

February 22, 2023


Check out how you perform against ChatGPT, DeepL, Grammarly, and QuillBot.

Credit: Aleutie/Getty

It is challenging to make a comprehensive, objective, and easily verifiable comparison between AI proofreading tools. To make an attempt, I selected a short but thorough grammar test and ran it through the best AI grammar checkers I could find.

Take a look at these 20 flashcards (flip each card to see the correct variant) and check how your results compare to the most advanced AI-based proofreading tools out there.

I tested the following two models from OpenAI.com:

  • ChatGPT, their latest creation, which can correct grammar if you write “Correct this to standard English:” followed by your text;
  • DaVinci-003, their previous model based on GPT-3.5 (also very good at proofreading), accessible via the Playground.

I’ve also included:

  • QuillBot, the leader of last year’s benchmark;
  • Grammarly, an established leader in automated proofreading, in both its free and paid versions.

Finally, there’s DeepL. While their main product is automated translation between 31 languages, they also have a strong proofreading module (currently in beta).

The results of the test are summarized in the table below:

As I showed in my previous article, ChatGPT has difficulties with “there is/there are” (sentence #13). Other than that, it did perfectly well, being the only one to fix logical errors (#20) and to recognize the appropriate use of the progressive tense for temporary actions or adjectives (#19). It should be mentioned, though, that the top OpenAI models are noticeably slow: it takes 21 seconds for ChatGPT and ~30 seconds for DaVinci to proofread the 20 sentences listed above. A cut-down version of DaVinci called “Curie-001” is faster, but it scored only 3 points, so I didn’t include it in the table.

Only two tools, DeepL and Google Docs, correctly distinguished between “there is” and “it is” (#13).

Surprisingly, DeepL found no problems with #8 (“It is snow”), but in general, it performed quite decently. Although it seems to miss more mistakes than ChatGPT, it works faster (2.5 sec), and its interface is fine-tuned for proofreading: it marks its suggestions (although, in contrast with QuillBot, it does not mark deletions) and offers alternative versions. It does have problems with false positives, though (see the “Grammar checking vs paraphrasing” section below for details).

Even the ‘previous’ OpenAI model, DaVinci, managed to outperform tools like Grammarly and QuillBot. DaVinci has a slightly more convenient user interface than ChatGPT for proofreading, although it still does not mark its suggestions, only displaying the corrected version as plain text.
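Since the OpenAI models return only the corrected text, one way to see what was actually changed is to diff the output against the input yourself. The sketch below uses Python's standard `difflib` module to mark word-level edits; the function name `mark_changes`, the bracket notation, and the sample sentence are illustrative assumptions, not part of any of the tested tools.

```python
import difflib


def mark_changes(original: str, corrected: str) -> str:
    """Return the text with word-level edits marked inline.

    Removed words appear as [-word-], added words as {+word+}.
    """
    a, b = original.split(), corrected.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(a[i1:i2])
            continue
        # A "replace" is shown as a deletion followed by an insertion.
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(a[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(b[j1:j2]) + "+}")
    return " ".join(out)


# Made-up example sentence (not one of the 20 test sentences).
print(mark_changes("She go to school every day",
                   "She goes to school every day"))
# prints: She [-go-] {+goes+} to school every day
```

Splitting on whitespace keeps punctuation attached to words, so a changed comma shows up as a changed word; that is crude, but usually enough to review a proofreader's output at a glance.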

As you can see from the table above, the free version of Grammarly is seriously stripped down compared to the paid version. It has a nice user interface, but, just like QuillBot, it falls short in comparison to the latest neural networks such as DeepL and ChatGPT.

Google Docs and GrammarCheck performed surprisingly well in this test, scoring 12 points each (I haven’t included GrammarCheck in the table because in the previous tests it performed much worse than QuillBot and Grammarly, so this high score is most probably pure chance). Google Docs was also the fastest one (1.5 sec). ProWritingAid (not in the table, either) scored 6 points; however, it has a lot of other niceties that help with writing in more advanced ways than just proofreading.

Most of the tested tools are free to use, but many of them have a rather limited amount of text they can proofread at a time, so a long text needs to be broken into parts to be checked. For example, for DeepL, it is 2000 characters, and for DaVinci, it is about two pages of text (with 3000 characters per page).
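Breaking a long text into parts can be automated. The following is a minimal sketch, assuming you want to keep paragraphs (separated by blank lines) intact wherever possible; the function name `split_into_chunks` is hypothetical, and the default limit of 2000 characters is the DeepL figure quoted above.

```python
def split_into_chunks(text: str, limit: int = 2000) -> list[str]:
    """Greedily pack whole paragraphs into chunks of at most `limit` characters.

    A single paragraph longer than `limit` is hard-split at the limit.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # Hard-split oversized paragraphs so that no piece exceeds the limit.
        pieces = [paragraph[i:i + limit]
                  for i in range(0, len(paragraph), limit)]
        for piece in pieces:
            candidate = piece if not current else current + "\n\n" + piece
            if len(candidate) <= limit:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


# Five 900-character paragraphs fit into three chunks under the 2000 limit.
sample = "\n\n".join(["x" * 900] * 5)
print([len(chunk) for chunk in split_into_chunks(sample)])
# prints: [1802, 1802, 900]
```

For DaVinci, the same function could be called with `limit=6000`, which corresponds to the "two pages of 3000 characters" estimate above. Hard-splitting an oversized paragraph may cut a sentence in half, so splitting at sentence boundaries would be a sensible refinement.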

ChatGPT does not have such a strict limit, but the more text you include in a single query, the more likely it is to switch from proofreading to summarizing. In this regard, DaVinci is better than ChatGPT: it never switches to summarizing, yet its quality is only slightly lower than ChatGPT’s. As opposed to ChatGPT, which is currently free of charge, DaVinci has a quota. It is quite relaxed (>2000 pages), but it is still there: anything over the quota requires payment.

QuillBot has two separate pages: “Grammar Checker” and “Paraphraser.” On the “Grammar Checker” page, it tries its best to avoid rewriting whole sentences and limits itself to the problematic words only (although QuillBot obviously has a hard time with this: it sometimes paraphrases even on the “Grammar Checker” page, and even when it is not necessary).

The behavior of ChatGPT depends on the particular wording of the prompt:

  1. “Correct this to standard English.” This default proofreading prompt of the older models (DaVinci, Curie, etc.) looks too ambiguous to ChatGPT: sometimes, after proofreading the first couple of paragraphs, it starts paraphrasing the remaining ones.
  2. “Correct spelling and grammar.” With this prompt, ChatGPT tends to proofread much longer chunks of text than with the first one but still resorts to summarizing when the text gets too large. In contrast with the first prompt, the switch to summarizing mode is easily recognizable: it starts returning a single small paragraph instead of the complete text with the corrections.
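The switch to summarizing described in the second item can be caught automatically by comparing lengths. Below is a rough sketch: `build_query` puts the prompt above the text, as described earlier, and `looks_summarized` flags a response that is much shorter than the input. Both function names and the 0.7 threshold are assumptions chosen for illustration, not measured values, and no actual API call is shown.

```python
PROMPT = "Correct spelling and grammar."


def build_query(text: str, prompt: str = PROMPT) -> str:
    """Put the instruction first and the text to proofread below it."""
    return f"{prompt}\n\n{text}"


def looks_summarized(original: str, response: str,
                     min_ratio: float = 0.7) -> bool:
    """Heuristic: proofreading keeps the text about as long as the input,
    while a summary comes back as a single small paragraph.

    `min_ratio` is an arbitrary cut-off (an assumption, not a tuned value).
    """
    if not original.strip():
        return False
    return len(response) / len(original) < min_ratio


# A proofread sentence keeps roughly the same length...
print(looks_summarized("I has a cat.", "I have a cat."))        # prints: False
# ...while a long input collapsed into one line does not.
print(looks_summarized("word " * 400, "A short summary."))      # prints: True
```

If the check fires, a reasonable fallback is to resend the text in smaller parts.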

DeepL Write has no configuration options at all. It always suggests a mix of grammar corrections and paraphrasing, which, for the most part, defeats its purpose in proofreading, as you can’t tell the real mistakes from the synonym suggestions.

As an example, consider the following passage from “The Hobbit” by J.R.R. Tolkien, in which I’ve introduced one glaring grammar mistake. DeepL found as many as 17 “mistakes” in it (of which 16 are false positives and only one is real). Can you spot the real mistake amongst the fake ones?

If you’re pressed for time, here is a hint for you. This is the sentence with the real mistake:

Can you find it now?

QuillBot successfully identified the mistake. In addition to that, it suggested adding three commas here and there, and gave the strange advice to change “neatly brushed” to “nicely brushed” (QuillBot knows better!).

ChatGPT was also unhappy about the commas, and it also successfully corrected the mistake. As a side effect, it suggested the following suspicious change:

ChatGPT knows better!

Google Docs doesn’t really care about the sequence of tenses, so it found no problems with this paragraph except for the rightful recommendation to replace the British ‘woolly’ with an American ‘wooly’ or to switch the locale to British.

The best performer in terms of the number of mistakes found is ChatGPT, but it has certain usability problems when it comes to proofreading:

  • roughly 8 times slower than DeepL on this test (21 sec vs. 2.5 sec),
  • does not highlight its suggestions,
  • does not provide alternatives for particular words or phrases.

The “Write” feature of DeepL is more about paraphrasing than grammar checking. It has an unacceptable number of “false positives” (when it “corrects” perfectly valid sentences), so it has little practical use in proofreading — even though it does, in fact, find a lot of mistakes.

QuillBot identifies fewer mistakes than DeepL and ChatGPT, but it is still one of the most convenient tools out there as it works fast, highlights the suggested changes, suggests alternatives, and has a small number of “false positives.”

What score did you get on the test? Which neural network has the closest score to yours?

PS. Let’s wait until Google’s Bard goes public and see if it performs better than ChatGPT in this test. Or maybe someone has access to it already?


