How to Generate Human-Like Text with Chat-GPT

by Davide Guaglianone | Jan 31, 2023

Recently, our team started using Chat-GPT to generate ideas for copywriting and content creation. I was curious about its capabilities and started trying it out. I must admit I’m not an expert in this area, but I found some interesting information along the way. Like any new tool, it comes with a learning curve. In this article, I will share my experience and tips on how to generate human-like text with Chat-GPT.

If you’re like me and have been tasked with creating content for your company’s website or marketing materials, you know how time-consuming and challenging it can be to come up with unique and engaging ideas. That’s where Chat-GPT comes in.

Perplexity and burstiness: Key concepts for understanding human-like generated text

While exploring language models like Chat-GPT, I came to understand the importance of perplexity and burstiness. So, what exactly are these concepts, and why do they matter?

Perplexity is a metric that evaluates how well a language model can predict a given sample of text. The lower the perplexity, the less the text “surprises” the model; the higher it is, the less predictable the model finds the text.

Burstiness, on the other hand, is a phenomenon that occurs when certain words or phrases appear more frequently within a given context than would be expected by chance alone. In natural language, burstiness is a common feature, as specific words or topics tend to cluster together in conversations or written texts. For instance, when discussing a particular subject like “climate change,” related terms such as “global warming,” “carbon emissions,” and “greenhouse gases” are likely to appear more frequently within that context.

In the realm of language models, accounting for burstiness is essential to produce more realistic and contextually appropriate text. If a model fails to capture the burstiness inherent in natural language, the generated text may appear unnatural or disjointed. Ideally, a language model should be able to identify and replicate the burstiness present in human-written text, leading to more coherent and contextually relevant outputs.
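To make the idea of burstiness a bit more concrete, here is a minimal sketch in Python. It assumes one simple way of measuring it: split the text into fixed-size windows of words, count a term in each window, and divide the variance of those counts by their mean. The function name, the window size, and the choice of this ratio are my own assumptions for illustration, not the formula used by any particular tool.

```python
import re
from statistics import mean, pvariance


def burstiness(text: str, term: str, window: int = 50) -> float:
    """Variance-to-mean ratio of a term's count across fixed-size word windows.

    Values well above 1.0 mean the term clusters in a few windows (bursty);
    values near 0.0 mean it is spread evenly. Returns 0.0 if the term is absent.
    """
    words = re.findall(r"[a-z']+", text.lower())
    # Count the term inside each consecutive window of `window` words.
    counts = [
        words[i:i + window].count(term.lower())
        for i in range(0, len(words), window)
    ]
    avg = mean(counts) if counts else 0.0
    if avg == 0:
        return 0.0
    return pvariance(counts) / avg


# Same number of mentions, distributed differently across four 10-word windows.
clustered = "carbon " * 4 + "filler " * 36
even = ("carbon " + "filler " * 9) * 4

print(burstiness(clustered, "carbon", window=10))
print(burstiness(even, "carbon", window=10))
```

In this toy example the clustered text scores higher than the evenly spread one, even though both mention the term four times.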

These two metrics are often combined to evaluate a text and guess whether it was written by a human or generated by a language model. The usual assumption is that text with low perplexity and low burstiness is likely machine-generated, while human writing tends to be less predictable and more uneven. In practice this does not hold reliably, at least where perplexity is concerned.

Perplexity simply measures how well a particular language model can predict a given text, so whether a score counts as low or high depends on the model used for the comparison. The same text could have high perplexity when scored by a small language model and low perplexity when scored by a large one. The goal when producing content is not to hit a high or low perplexity, but to produce cohesive, coherent, high-quality text.
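A small worked example may help here. Perplexity is commonly computed as the exponential of the average negative log-probability that a model assigns to each token. The sketch below uses made-up probabilities for two hypothetical models scoring the same five tokens; the numbers are purely illustrative and do not come from any real model.

```python
import math


def perplexity(token_probs: list[float]) -> float:
    """exp of the average negative log-probability the model gave each token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log)


# Hypothetical probabilities two models assign to the same five tokens.
small_model = [0.10, 0.05, 0.20, 0.10, 0.05]
large_model = [0.50, 0.40, 0.60, 0.50, 0.40]

print(perplexity(small_model))  # higher: the small model is more "surprised"
print(perplexity(large_model))  # lower: the large model predicts the text better
```

The text is identical in both cases; only the model doing the scoring changes, and with it the perplexity.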

Limitations of AI-generated content detection tools like GPTZero

Talking about perplexity, burstiness, and human-like text, I must also mention GPTZero.

GPTZero, although not the only player in the field, has most certainly become one of the most renowned and widely used AI content detection tools. Its rise to prominence follows the rise of Chat-GPT, which has recently gathered a good amount of attention.

It is essential to recognize that, just as Chat-GPT or any other language model may occasionally generate content that seems plausible but is, in fact, fabricated or inaccurate, GPTZero can also make mistakes in discerning AI-generated text from human-written content. Both Chat-GPT and GPTZero rely on fallible language models, and consequently, their outputs are not immune to errors.

To put things into perspective, I once conducted a personal experiment where I submitted a text written by me to GPTZero, only to see it flagged as AI-generated content. On the other hand, when I submitted a text generated by Chat-GPT to the platform, it was identified as human-generated. The AI-generated text was the following:

The concisely flummoxed otter jubilantly inaugurated an avant-garde symposium of interdisciplinary salutations, defying the conventional expectations of its astonished spectators. Perambulating with unanticipated sagacity, the aquatic mustelid proceeded to expound on the intricacies of quantum chromodynamics, eliciting a mélange of bewilderment and admiration from the erudite assembly.

I would never use such a writing style, and yet the response was: “Your text is likely to be written entirely by a human”.

This anecdote does not imply by any means that GPTZero is entirely unreliable; rather, it serves as a reminder that, just as we acknowledge Chat-GPT’s potential for creating misleading content, we should also remain cognizant of the limitations inherent in AI-checkers like GPTZero.

Customizing input commands to improve AI text quality

There are different methods to write content, and none of them is intrinsically right or wrong; it depends on what you want to achieve and convey with your content. You could give Chat-GPT a list of parameters to follow, or you could describe what you want to achieve in a more abstract way.

A good starting point is the “Act as…” command, which asks Chat-GPT to take on a specific profession and draw on the know-how of that job to create the best content possible. For example, you could ask Chat-GPT to “act as a content writer” to generate an article. By doing so, Chat-GPT would keep in mind the usual writing guidelines about punctuation, text formatting, and so on when generating the text.

The “Act as…” command is adaptable: you can ask Chat-GPT to act as very different professions. Here is a list of some possibilities:

Content writer: An all-round content creator for various mediums, including websites, blogs, social media, and marketing materials.

Copywriter: Focuses on creating engaging and persuasive content for advertising, marketing, or sales materials.

Technical writer: Specializes in creating detailed and accurate documentation for technical products, services, or processes.

Blogger: Writes informal, conversational, and opinionated articles for websites or online platforms.

Journalist: Reports on news, events, or developments with a focus on accuracy, objectivity, and timeliness.

Scriptwriter: Crafts narratives, dialogues, and scenes for various forms of media, including movies, television, or theater.

Academic writer: Produces research papers, dissertations, or other scholarly works based on rigorous methodologies and evidence.

Grant writer: Develops proposals for funding or resources from government agencies, foundations, or other organizations.

Speechwriter: Composes speeches or presentations for public figures, politicians, or business leaders.

Social media writer: Creates engaging and shareable content for various social media platforms, often in shorter formats.

Furthermore, you can specify the nuances you want Chat-GPT to keep in mind by adding some adjectives or definitions. For example, you could ask it to act as an informative and concise content writer, rather than a plain content writer, to create brief yet information-rich articles. Note that the adjectives attached to the profession and the tone set for the content can contradict each other, so check that they agree.

Fine-tuning the input command

There are many more parameters to consider when writing the input command. Some of them require a description to convey what you are trying to achieve, while others can take a simple “True” or “False” to state whether you want that parameter applied. Some have more impact than others on the outcome, but when put together they can tailor the content closely to your needs.

Here is a list of the parameters with an example value for each of them:

Tone: The tone of voice of the content. You can go for something like “conversational, engaging, friendly”; it doesn’t have to be a single word.

Complexity: The general complexity of the content. Possible values are high, medium, and low.

Target audience: The target audience of your content. You can give a plain description like “people interested in using Chat-GPT for content writing”.

Table of contents: Unless you need something specific, you can use “True” as a value here.

Key takeaway: “Highlighted at the end of the article to summarize the main points” is what I usually use. Feel free to change the definition, but be sure to explain the purpose of this parameter.

Rhetorical questions: “Integrated strategically to emphasize key points and engage readers”. This allows rhetorical questions only where they add something to the sentence.

Figures of speech: “Employed selectively to captivate the reader and underscore essential points”. This makes sure that each figure of speech used has a purpose.

Idiomatic expressions: “Used sparingly and fittingly to inject personality and convey ideas effectively”. Depending on the context you might want to include more or fewer idiomatic expressions. Keep in mind that too many of them can make the text difficult to understand for non-native speakers.

Structure: This is the structure of the article. You can find a list of possible structures here: ChatGPT Content Structures

Number of paragraphs: Generally speaking, between 5 and 9 paragraphs are enough. The number you choose will lead Chat-GPT to write more or less content about the topic to fit the number of paragraphs.

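Language model: The language model you want Chat-GPT to imitate. For general use, GPT-3 175B is a good choice; it is the largest published GPT-3 size, and there is no 345B version. Keep in mind that naming a model in the prompt does not change which model actually produces the answer; at most it acts as a style hint. You can find a list of all the language models we tested and recommend here: ChatGPT language models for fine-tuning.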

Alternatively, you can check the complete, constantly updated list of more than 160,000 models here: https://huggingface.co/models

Total length: The total length of your content. It’s better to use a range, like “1200-1500 words”, instead of a specific number; this gives the language model more flexibility.

Topic: The topic of the content. It could be the title of the article you want to write or a broader topic.

Changes for readability: “Continuously made, focusing on clarity, flow, and a seamless reading experience” is what I generally use. This parameter makes sure that the language model focuses attention on the readability of the content and not only on listing information.

Examples: You can use “True” as a value or a description like “Relevant examples and case studies that support the article’s recommendations”. This will enable real-world examples in your content, making it more natural.

Base for the article: [Labeled information]. You can use previously labeled information as a base for the article. In case you missed our guide on information labeling, you can find it here: Information labeling guide

Call to Action (CTA): Not a mandatory parameter. You can use “Incorporating a clear and persuasive CTA”.

Citations and references: “Providing credible sources” in the case of an article can be extremely useful.

Content Formatting:

– Use bullet points for summarizing key points or listing items
– Use numbered lists for step-by-step instructions or ranking items
– Bold text for important phrases or headings
– Italic text for emphasizing specific words, quotes, or titles
– Blockquotes for highlighting quotes or excerpts from other sources
– Headers (H1, H2, H3, etc.) for organizing content into sections and sub-sections
– White space to break up large chunks of text and improve readability

The use of this parameter is optional and customizable but it really helps the overall readability.

Voice and Style: Specify your company’s brand voice and style guide.

Visual Aids: “Suggesting suitable visual elements (images, graphs, charts) to complement the content” will help you in case your content needs a chart or an image here and there. You never know.

Chat-GPT prompt: A working example

Once you have done all the preparation, you should have an input command that looks like this:

Act as a content writer, use the following parameters to write an article:

Tone: Informative, engaging
Complexity: Low.
Target audience: People interested in AI
Table of contents: True
Key takeaway: Highlighted at the end of the article to summarize the main points
Rhetorical questions: Integrated strategically to emphasize key points and engage readers
Figures of speech: Employed selectively to captivate the reader and underscore essential points
Idiomatic expressions: Used sparingly and fittingly to inject personality and convey ideas effectively
Structure: Explanatory structure
Number of paragraphs: 8
Language model: GPT-3 175B
Total length: 2000-2500 words.
Topic: Uses for AI in everyday life
Changes for readability: Continuously made, focusing on clarity, flow, and a seamless reading experience
Examples: Relevant examples and case studies that support the article’s recommendations
Citations and references: Providing credible sources
Content Formatting:
– Use bullet points for summarizing key points or listing items
– Use numbered lists for step-by-step instructions or ranking items
– Bold text for important phrases or headings
– Italic text for emphasizing specific words, quotes, or titles
– Blockquotes for highlighting quotes or excerpts from other sources
– Headers (H1, H2, H3, etc.) for organizing content into sections and sub-sections
– White space to break up large chunks of text and improve readability
Visual Aids: Suggesting suitable visual elements (images, graphs, charts) to complement the content

You don’t have to use every parameter every time. As in this example, you can omit the ones you don’t need.

The perk of using this structure for the input command is that you can quickly customize the parameters. Each row is a different parameter, so changing one means rewriting only a small portion of the text while leaving the whole structure intact.
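If you reuse this structure often, the one-row-per-parameter layout is easy to generate from a small script. The sketch below is only an illustration: the function name `build_prompt` and the dictionary layout are my own assumptions, and the output is plain text that you would still paste into the chat yourself.

```python
from typing import Dict, List, Union

ParamValue = Union[str, int, bool, List[str]]


def build_prompt(role: str, params: Dict[str, ParamValue]) -> str:
    """Render an "Act as..." prompt with one "Name: value" row per parameter."""
    lines = [f"Act as {role}, use the following parameters to write an article:", ""]
    for name, value in params.items():
        if isinstance(value, list):
            # Multi-item parameters become a header row plus dash-prefixed rows.
            lines.append(f"{name}:")
            lines.extend(f"– {item}" for item in value)
        else:
            lines.append(f"{name}: {value}")
    return "\n".join(lines)


params: Dict[str, ParamValue] = {
    "Tone": "Informative, engaging",
    "Complexity": "Low",
    "Target audience": "People interested in AI",
    "Table of contents": True,
    "Number of paragraphs": 8,
    "Topic": "Uses for AI in everyday life",
    "Content Formatting": [
        "Use bullet points for summarizing key points or listing items",
        "Bold text for important phrases or headings",
    ],
}

prompt = build_prompt("a content writer", params)
print(prompt)

# Changing one row leaves the rest of the structure intact.
params["Tone"] = "Conversational, friendly"
updated_prompt = build_prompt("a content writer", params)
```

Omitting a parameter is just a matter of leaving its key out of the dictionary.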

Tokens: Chat-GPT’s currency

A token is a unit of language that a computer program, like an AI language model, uses to process and understand the text. In simple terms, a token is like a word, a punctuation mark, or a part of a word that the program treats as a separate entity. The number of tokens used in a question and its corresponding answer must fit within the maximum limit of tokens that can be processed by the AI language model.

This means that if you use too many tokens in your input command, you might not leave enough space for the response, so you need to strike the right balance between length and level of detail. Every language model has its own token limit. For example, the original T5 models were trained on sequences of 512 tokens, the original GPT-3 has a context window of 2,048 tokens, and the GPT-3.5 models that followed it have a limit of around 4,096.

The input command I used in the last example comes to around 290-300 tokens, depending on the exact wording, leaving plenty of space for a response within a limit of a few thousand tokens.
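To get a rough feel for how much room a prompt leaves for the answer, you can estimate its token count before sending it. The sketch below relies on the common rule of thumb that one token is about four characters of English text. That ratio, the function names, and the default limit of 4096 are assumptions for illustration; an actual tokenizer will give different counts.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; real tokenizers will give different counts."""
    return math.ceil(len(text) / chars_per_token)


def response_budget(prompt: str, context_limit: int = 4096) -> int:
    """Tokens left for the reply once the prompt is counted against the limit."""
    return max(context_limit - estimate_tokens(prompt), 0)


example_prompt = "Act as a content writer, use the following parameters to write an article:"
print(estimate_tokens(example_prompt))
print(response_budget(example_prompt))
```

If the remaining budget looks too small for the length you asked for, shorten the prompt or ask for a shorter text.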

Text transformation and rewriting

Although Chat-GPT can generate content on a wide range of topics, its power does not lie only in its text-generating capabilities. It can also transform existing text to fit other parameters. This way you can generate a text once and then produce different versions of it instead of writing everything from scratch each time, which comes in handy when you are exploring new ideas and want to leave every door open.

Simple text transformation

For a simple text transformation, you can paste the content you want Chat-GPT to transform into the chat, enclosed between quotation marks, and then right below in the same message add this:

Using [name of language model], please transform the text to make it more human-like by including figures of speech, idiomatic expressions, rhetorical questions, and anecdotes. Aim to create text that is similar in quality to the high-quality human-generated text. Avoid using repetitive patterns and improve the readability of the text to make it engaging and informative

Of course, you need to substitute [name of language model] with the name of the language model you want to use. Unless you have some specific needs, you could use a language model of the T5 family (which stands for Text-to-Text Transfer Transformer) for this task, specifically T5-11B would probably fit your needs to perfection.

Text transformation with a change of analogy/metaphor

Sometimes you may feel that the metaphor used is not quite fitting, perhaps because the tone is too conversational or because it does not convey what you have in mind. Not a problem: in a case like this, you can follow the same process with this input command:

Using [name of language model], please transform the text to make it more human-like by including figures of speech, idiomatic expressions, rhetorical questions, and anecdotes. Aim to create text that is similar in quality to the high-quality human-generated text. Avoid using repetitive patterns and improve the readability of the text to make it engaging and informative. Additionally, please change the used metaphor or analogy to make the text feel more natural and less machine-generated

For this task, you could still name a language model from the T5 family like T5-11B, an encoder-decoder model built for text-to-text tasks, or one from the GPT family, whose models are geared towards open-ended generation, since you want to introduce new metaphors and analogies.

Elaborating further on a specific paragraph of the table of contents

It has happened to me more than once: the article that Chat-GPT produced is well written and ticks all the boxes, but one paragraph needs some work. You only need to rework that single paragraph, not the entire article. When this happens, this is what you can use:

Use [name of language model] and elaborate more on the [Nth] paragraph of the Table of Content, focusing on improving the clarity, flow, and readability of the paragraph. Use specific examples and case studies to support your points, and avoid repeating information already covered in previous paragraphs. Incorporate rhetorical questions, figures of speech, and idiomatic expressions where appropriate to engage the reader. Use an informative, authoritative, and engaging tone, suitable for [target audience] but do not mention them. Discuss only the topic of the paragraph. State the paragraph’s name

Feel free to customize the parameters as you see fit; this is just an example. In this case, you want to substitute:

the [name of language model] with the name of your language model of choice
the [Nth] with the number of the paragraph you want to work on
the [target audience] with the audience you are aiming at.
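The substitution can also be scripted, which helps avoid sending a prompt with a forgotten placeholder still in it. The sketch below uses a shortened version of the command above as its template; the function name `fill_placeholders` and the example values are hypothetical.

```python
import re

# Shortened version of the input command, kept only for illustration.
TEMPLATE = (
    "Use [name of language model] and elaborate more on the [Nth] paragraph of "
    "the Table of Content. Use an informative, authoritative, and engaging tone, "
    "suitable for [target audience] but do not mention them."
)


def fill_placeholders(template: str, values: dict[str, str]) -> str:
    """Replace each [placeholder] with its value; fail loudly if one is missing."""

    def substitute(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for [{key}]")
        return values[key]

    # Match anything between square brackets that contains no nested brackets.
    return re.sub(r"\[([^\[\]]+)\]", substitute, template)


filled = fill_placeholders(
    TEMPLATE,
    {
        "name of language model": "GPT-3 175B",
        "Nth": "3rd",
        "target audience": "people interested in AI",
    },
)
print(filled)
```

Because a missing value raises an error, a half-filled command never reaches the chat.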

The target audience needs to be specified again because you are asking for further elaboration on a topic that Chat-GPT did not cover in enough depth, and restating the audience helps tailor the new content to it.

For this task, there is no single language model to recommend; it depends entirely on what you want to achieve. If you lean towards simple transformation with less text generation, you can name the T5 family or the CTRL family. If you want something more specific you can name the RoBERTa family (bearing in mind that RoBERTa is an encoder-only model designed for understanding text rather than generating it), or the GPT family for something more creative.

Troubleshooting

From time to time, Chat-GPT will interrupt the text generation; no matter the input, some issues can still come up. Most likely you have stumbled upon one of these three situations:

You used some input commands that contradict the “Act as…” command. For example, you asked it to act as a doctor and, at the same time, not to use any medical terms.
It reached the limit of words it can generate in a single response. In this case, I simply use the command: “Repeat the last sentence and keep going”. This forces the AI to repeat the last sentence used so I can check that it didn’t cut off anything.
The AI is writing about a topic it was not trained on. This is quite rare and it happened to me just once or twice. In this case, use a different language model. Often, you can pick another language model of the same family.
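For the second case, once Chat-GPT has repeated the last sentence and carried on, the two responses need to be joined without duplicating the repeated part. Here is a minimal sketch of how that could be done, assuming the continuation starts with a verbatim copy of the end of the first response. The function name `stitch` and the minimum overlap of 10 characters are my own choices.

```python
def stitch(first_part: str, continuation: str, min_overlap: int = 10) -> str:
    """Join two responses, dropping the text the second one repeats from the first."""
    first_part = first_part.rstrip()
    continuation = continuation.lstrip()
    # Look for the longest ending of first_part that the continuation starts with.
    max_len = min(len(first_part), len(continuation))
    for size in range(max_len, min_overlap - 1, -1):
        if first_part.endswith(continuation[:size]):
            return first_part + continuation[size:]
    # No overlap found: keep both parts so the seam can be checked by hand.
    return first_part + " " + continuation


first = "Tokens matter. The limit is shared between prompt and"
second = "The limit is shared between prompt and response. Plan for it."
print(stitch(first, second))
```

If the repeated sentence was reworded rather than copied exactly, no overlap is found and both parts are kept, so it is still worth reading the seam.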

Conclusion

As someone who is not a seasoned pro in either the field of machine learning or copywriting, I can assure you that this article is by no means an exhaustive examination of every single case and issue related to Chat-GPT. However, I do hope that by sharing my personal experiences and insights, I can help someone else achieve even greater results.

The key to success when working with Chat-GPT, in my opinion, is to constantly keep experimenting and seeking out new ways to use the technology. I found that by taking things one step at a time and gradually building upon my understanding of the tool, I was able to make steady progress.

For more about how to use Chat-GPT, see Maximizing Chat-GPT’s Potential: A Guide to Its Hidden Labeling Feature
