Code Llama: a large language model fine-tuned for coding tasks.
So I had to put it to the test against ChatGPT!
Here are my findings [THREAD]:
✍️ Before we start:
- This is by no means a conclusive or thorough study. It was done for fun, testing different LeetCode coding questions (you can try them yourself for practice!) just to see how the two models would do.
- I’ll be using ChatGPT with GPT-3.5 and Code Llama Instruct - 34B through Perplexity.
- Most of the time, Perplexity printed the code without indentation, so I added the indentation manually.
- The models might do okay if you ask a second time or phrase the question differently. However, I wanted to test each one with a single prompt and no variations.
QUESTION 1
“Use Python. You are given two strings word1 and word2. Merge the strings by adding letters in alternating order, starting with word1. If a string is longer than the other, append the additional letters onto the end of the merged string.
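For reference, here's a minimal solution sketch of my own for this one (not either model's output):

def merge_alternately(word1: str, word2: str) -> str:
    # Interleave characters pairwise, then append whatever remains
    # of the longer word.
    merged = []
    for a, b in zip(word1, word2):
        merged.append(a + b)
    merged.append(word1[len(word2):])
    merged.append(word2[len(word1):])
    return "".join(merged)

print(merge_alternately("abc", "pqrst"))  # apbqcrst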
QUESTION 2
“Use Python. Given a string s, reverse only all the vowels in the string and return it.
The vowels are 'a', 'e', 'i', 'o', and 'u', and they can appear in both lower and upper cases, more than once.
Example 1:
Input: s = "hello"
Output: "holle"
🟢 ChatGPT: +1
🔵 Code Llama: 0
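My own two-pointer reference sketch for this one (not taken from either model):

def reverse_vowels(s: str) -> str:
    vowels = set("aeiouAEIOU")
    chars = list(s)
    i, j = 0, len(chars) - 1
    # Walk inward from both ends and swap each pair of vowels found.
    while i < j:
        if chars[i] not in vowels:
            i += 1
        elif chars[j] not in vowels:
            j -= 1
        else:
            chars[i], chars[j] = chars[j], chars[i]
            i += 1
            j -= 1
    return "".join(chars)

print(reverse_vowels("hello"))  # holle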
QUESTION 3
“Use Python. Given an integer array nums, move all 0's to the end of it while maintaining the relative order of the non-zero elements.
Note that you must do this in-place without making a copy of the array.
Example 1:
Input: nums = [0,1,0,3,12]
Output: [1,3,12,0,0]”
🟢 ChatGPT: +1
🔵 Code Llama: 0
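For reference, an in-place sketch of my own (not model output):

def move_zeroes(nums: list[int]) -> None:
    # Copy non-zero values forward, then overwrite the tail with zeros.
    write = 0
    for x in nums:
        if x != 0:
            nums[write] = x
            write += 1
    for i in range(write, len(nums)):
        nums[i] = 0

nums = [0, 1, 0, 3, 12]
move_zeroes(nums)
print(nums)  # [1, 3, 12, 0, 0]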
QUESTION 4
“Use Python. You have a long flowerbed in which some of the plots are planted, and some are not. However, flowers cannot be planted in adjacent plots.
Given an integer array flowerbed containing 0's and 1's, where 0 means empty and 1 means not empty, and an integer n, return true if n new flowers can be planted in the flowerbed without violating the no-adjacent-flowers rule and false otherwise.
Example 1:
Input: flowerbed = [1,0,0,0,1], n = 1
Output: true
Example 2:
Input: flowerbed = [1,0,0,0,1], n = 2
Output: false”
🟢 ChatGPT: +1
🔵 Code Llama: +1
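A greedy sketch of my own that I'd accept as a correct answer here:

def can_place_flowers(flowerbed: list[int], n: int) -> bool:
    # Pad with empty plots so the ends need no special casing,
    # then greedily plant wherever a plot and both neighbors are empty.
    bed = [0] + flowerbed + [0]
    for i in range(1, len(bed) - 1):
        if bed[i - 1] == 0 and bed[i] == 0 and bed[i + 1] == 0:
            bed[i] = 1
            n -= 1
    return n <= 0

print(can_place_flowers([1, 0, 0, 0, 1], 1))  # True
print(can_place_flowers([1, 0, 0, 0, 1], 2))  # False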
QUESTION 5
“Use Python. Given an input string s, reverse the order of the words.
A word is defined as a sequence of non-space characters. The words in s will be separated by at least one space.
Return a string of the words in reverse order concatenated by a single space.
Note that s may contain leading or trailing spaces or multiple spaces between two words. The returned string should only have a single space separating the words. Do not include any extra spaces.
Example 1:
Input: s = "the sky is blue"
Output: "blue is sky the"”
🟢 ChatGPT: +1
🔵 Code Llama: +1
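This one is almost a Python one-liner; my reference sketch (not model output):

def reverse_words(s: str) -> str:
    # split() with no arguments drops leading/trailing spaces and
    # collapses runs of spaces, which handles the edge cases for free.
    return " ".join(reversed(s.split()))

print(reverse_words("the sky is blue"))  # "blue is sky the"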
QUESTION 6
“Use Python. Given a string s and an integer k, return the maximum number of vowel letters in any substring of s with length k.
Vowel letters in English are 'a', 'e', 'i', 'o', and 'u'.
Example 1:
Input: s = "leetcode", k = 3
Output: 2
Explanation: "lee", "eet" and "ode" contain 2 vowels.”
🟢 ChatGPT: +1
🔵 Code Llama: +1
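My sliding-window reference sketch (not either model's output):

def max_vowels(s: str, k: int) -> int:
    vowels = set("aeiou")
    # Count vowels in the first window, then slide: add the character
    # entering the window and drop the one leaving it.
    count = sum(c in vowels for c in s[:k])
    best = count
    for i in range(k, len(s)):
        count += (s[i] in vowels) - (s[i - k] in vowels)
        best = max(best, count)
    return best

print(max_vowels("leetcode", 3))  # 2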
QUESTION 7
“Use Python. You are given a string s, which contains stars *.
In one operation, you can:
Choose a star in s.
Remove the closest non-star character to its left, as well as remove the star itself.
Return the string after all stars have been removed.
Example 1:
Input: s = "leet**cod*e"
Output: "lecoe"”
🟢 ChatGPT: +1
🔵 Code Llama: 0
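For reference, a stack-based sketch of my own (not either model's output):

def remove_stars(s: str) -> str:
    # Each star cancels the most recently kept character, so a stack fits.
    stack = []
    for c in s:
        if c == "*":
            stack.pop()
        else:
            stack.append(c)
    return "".join(stack)

print(remove_stars("leet**cod*e"))  # lecoe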
QUESTION 8
“Use Python. Given an array of integers temperatures represents the daily temperatures, return an array answer such that answer[i] is the number of days you have to wait after the ith day to get a warmer temperature. If there is no future day for which this is possible, keep answer[i] == 0 instead.
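For reference, a monotonic-stack sketch of my own for this one (not model output):

def daily_temperatures(temperatures: list[int]) -> list[int]:
    answer = [0] * len(temperatures)
    stack = []  # indices of days still waiting for a warmer temperature
    for i, t in enumerate(temperatures):
        # Resolve every earlier day that is colder than today.
        while stack and temperatures[stack[-1]] < t:
            j = stack.pop()
            answer[j] = i - j
        stack.append(i)
    return answer

print(daily_temperatures([73, 74, 75, 71, 69, 72, 76, 73]))
# [1, 1, 4, 2, 1, 1, 0, 0]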
Many people are talking about Claude being a better option than ChatGPT.
So I decided to put them to the test!
- Reasoning
- Simple math
- Coding
- Creativity & more
Here are my findings:
✍️ Before we start:
- This is by no means a conclusive or thorough study. It was done for fun, testing different small questions just to see how the models would do.
- I’ll be using ChatGPT with GPT-4 (let’s call it ChatGPT+).
- I didn’t include here the questions that both got correct, which were A LOT (more numbers later).
- The models might do okay if you ask a second time or phrase the question differently. However, I wanted to test each one with a single prompt and no variations.