How to Transcribe a Phone Call Recording
June 3, 2026
Transcribing a phone call recording is the easy part once you have the audio file. The harder part comes earlier: getting a clean recording of both sides, and doing it within the law where you are. This guide covers the legal question, where call recordings actually come from, how to turn one into a speaker-labeled transcript, and how to handle a recording that often holds something you would not want anyone else to read.
Calls are some of the most sensitive audio people keep. A recorded negotiation, a doctor’s call, an HR conversation, a quote from a source: the contents tend to be more private than almost anything you would type. That sensitivity is the thread running through the whole guide, and it is the reason a privacy-first, upload-based tool fits this job better than one that quietly holds onto your files.
What you need
There is not much to set up.
- A recording of the call, as an audio file on a device that has a browser. MP3 and M4A are the common formats from call recorder apps, and both work without conversion.
- A modern browser. Nothing to install, on the phone or the computer, because transcription runs in the browser and in the cloud.
- An account, but only when you are ready to transcribe the full file. The 30-second preview needs no account at all.
A short word on cost, because honesty about it is part of the point. Hushscript is not free, since it runs on top-tier transcription AI that carries a real per-minute cost. It is pay-as-you-go with no subscription, and you get 30 free minutes to try it. Those minutes arrive instantly when you validate a card with a $1 hold that is authorized and then released right away, never charged. If you would rather pay another way, the 30 minutes are granted once after your first minutes purchase, and a card is not required. The pack prices are on the pricing page.
Recording a call legally
Know the law before you record. Recording rules differ by country, by state or province, and sometimes by the type of call, so the safe move is to check the rule that applies to you rather than assume. The distinction that matters most is between one-party and all-party consent.
Under one-party consent, you can record a call you are part of without telling the other person, because you are one party and you consent. Under all-party consent, sometimes called two-party consent, every person on the call has to know it is being recorded and agree to it. Some jurisdictions follow one rule, some the other, and a call that crosses a border can be subject to the rules at both ends, in which case the stricter rule is the safe one to follow.
The simplest way to stay clear of trouble is to say it out loud. A line like “I’m recording this call for my notes, is that alright?”, once the other person says yes, covers you in an all-party jurisdiction and is rarely a problem in practice. If the recording is for something legally sensitive, such as a dispute or anything that might end up as evidence, confirm the specific rule that applies first. This guide is informational, not legal advice.
Where call recordings come from
The phone’s operating system shapes what is even possible, so the right approach depends on your device.
On iPhone, native call recording only arrived with iOS 18.1. It is limited by region, and it announces the recording to everyone on the call, which is why this is the most asked-about case. Where the built-in option is unavailable or unsuitable, the usual workaround is a third-party app such as TapeACall or Rev Call Recorder. These typically work by merging your call into a three-way conference with a recording line, so both sides land in the file. They generally save as M4A.
On Android, the built-in Google Phone app records calls natively in some regions, writing the recording straight to the device. Where that is unavailable, apps like Cube ACR work on most handsets by tapping the phone’s audio or using the speakerphone. Output is usually M4A or MP3.
For a landline, a VoIP call, or a desktop call, recording is often cleaner. A dedicated recording adapter handles a landline, and VoIP and conferencing platforms such as Google Meet, Skype, and Zoom have their own recording feature that saves a file you can download afterward.
A voicemail counts too. Save or export the message as an audio file, by forwarding it, using your carrier’s export option, or recording the playback, and you can transcribe it like any other recording.
Whatever the source, you end up with a single audio file. From here the process is the same for all of them.
Transcribe the recording in three steps
The flow is built so you can hear the quality before you commit anything.
- Drop the file and watch the preview. Go to the voice recording to text page and drag your recording onto the upload area, or click to browse. Hushscript transcribes the first 30 seconds and shows it back to you with speaker labels. No account, no payment, nothing to fill in. This is where you check that both sides are audible and that the two voices are being separated the way you expect.
- Sign up to transcribe the rest. If the preview looks right, enter your email to create an account. This step is what enables full transcription, and it is where your 30 free minutes come from. Files up to 10 hours or 2 GB are accepted, with no daily caps.
- Get, relabel, and export the transcript. Upload the full recording and let it process. When it finishes, the transcript appears in your dashboard with the speakers marked. Rename “Speaker A” to a real name in a click, then export as TXT, SRT, DOCX, or JSON.
The moment your transcript is ready, the audio is deleted from the server. There is no setting to keep it, and it is never used to train anything. You keep the transcript; the recording is gone.
A worked example: a two-sided call
Say you recorded a 12-minute call with a vendor about a contract renewal, both of you audible, saved by your call recorder app as renewal-call.m4a. You drop it onto the preview, the first 30 seconds come back with two labels, and the separation looks clean. You sign up, transcribe the whole thing, and a minute later export a TXT that, before you rename the speakers, reads like this:
Speaker A 00:00:08 Hi Sarah, thanks for calling back. I wanted to go
over the renewal terms before we sign anything.
Speaker B 00:00:14 Of course. I've got the contract open. Where do you
want to start?
Speaker A 00:00:19 The termination clause. The current 90-day notice
feels long for us.
Speaker B 00:00:26 That's fair. We can look at 60 days, but it would
change the discount tier slightly.
Each turn carries a speaker tag and a timestamp. That matters on a call more than on most recordings, because the value of a call transcript usually lies in being a record of what was agreed and who agreed to it. Finding the exact line where someone committed to a figure is a matter of scanning the timestamps, not scrubbing back and forth through audio. If you export SRT instead, you get the same content cut into timed caption blocks.
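If you would rather search an exported TXT with a script than by eye, the layout in the sample above is simple enough to parse. The sketch below is a minimal illustration, not an official Hushscript tool. It assumes every turn starts with a speaker label, then an HH:MM:SS timestamp, then the text, and that wrapped text continues on the following lines, exactly as in the sample. The names `Turn`, `parse_transcript`, and `find_mentions` are made up for this example, and a real export may differ in detail.

```python
import re
from dataclasses import dataclass

# Assumed turn layout, taken from the sample export in this guide:
#   "<speaker label> HH:MM:SS <text>", with wrapped text on the next line(s).
TURN_RE = re.compile(r"^(?P<speaker>.+?) (?P<ts>\d{2}:\d{2}:\d{2}) (?P<text>.*)$")


@dataclass
class Turn:
    speaker: str
    seconds: int  # offset from the start of the recording
    text: str


def parse_transcript(raw: str) -> list[Turn]:
    """Split an exported TXT transcript into speaker turns."""
    turns: list[Turn] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TURN_RE.match(line)
        if match:
            hours, minutes, seconds = (int(p) for p in match.group("ts").split(":"))
            offset = hours * 3600 + minutes * 60 + seconds
            turns.append(Turn(match.group("speaker"), offset, match.group("text")))
        elif turns:
            # No timestamp on this line: treat it as wrapped text of the last turn.
            turns[-1].text += " " + line
    return turns


def find_mentions(turns: list[Turn], needle: str) -> list[Turn]:
    """Return the turns whose text contains the phrase, ignoring case."""
    needle = needle.lower()
    return [turn for turn in turns if needle in turn.text.lower()]


SAMPLE = """\
Speaker A 00:00:08 Hi Sarah, thanks for calling back. I wanted to go
over the renewal terms before we sign anything.
Speaker B 00:00:14 Of course. I've got the contract open. Where do you
want to start?
Speaker A 00:00:19 The termination clause. The current 90-day notice
feels long for us.
Speaker B 00:00:26 That's fair. We can look at 60 days, but it would
change the discount tier slightly.
"""

for hit in find_mentions(parse_transcript(SAMPLE), "60 days"):
    print(hit.speaker, hit.seconds, hit.text)
```

Run against the sample, the search for "60 days" lands on the single Speaker B turn stamped at 26 seconds, which is the kind of lookup the timestamps are there for.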
Two-sided calls, two speakers
A call is the case where speaker separation earns its keep. When both parties are audible in one file, Hushscript separates them automatically, mapping Speaker A and Speaker B to the two voices, so the transcript reads as the back-and-forth it actually was rather than one merged block.
To put real names in, click any speaker label, type the name, and every line attributed to that speaker updates across the whole transcript. You only do it twice on a two-person call. For more on how the labelling is produced and how it copes with similar-sounding voices, the speaker identification page goes deeper.
A few things shape how well the separation works:
- Both sides in one track is what most call recorder apps produce, a stereo or mono mix carrying both parties. Speaker separation works cleanly on it.
- One side only happens when the recorder captured just the device microphone, so your voice is clear and the other person is faint or missing. This produces a one-sided transcript with a single speaker to label. If it happens, set the app to capture both the earpiece and the microphone, or use speakerphone next time so the phone’s mic hears both people.
- Separate files per side is rarer. If you ended up with two mono recordings, one per person, you have two options: merge them into a single file in any audio editor before uploading, or transcribe each side as its own file and combine the two transcripts afterward.
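If you have two per-side files and no audio editor to hand, the merge described in the last item can also be scripted. The sketch below uses only Python's standard library and rests on some assumptions: both recordings are 16-bit mono WAV files at the same sample rate (an M4A or MP3 would need converting first), and they start at the same moment. The function name `merge_mono_to_stereo` is made up for this example. It places one person on each stereo channel and pads the shorter file with silence.

```python
import wave
from array import array


def merge_mono_to_stereo(left_path: str, right_path: str, out_path: str) -> None:
    """Combine two 16-bit mono WAV files into one stereo WAV file."""
    with wave.open(left_path, "rb") as left, wave.open(right_path, "rb") as right:
        for source in (left, right):
            if source.getnchannels() != 1 or source.getsampwidth() != 2:
                raise ValueError("expected 16-bit mono WAV input")
        if left.getframerate() != right.getframerate():
            raise ValueError("sample rates differ; resample one file first")
        rate = left.getframerate()
        left_samples = array("h", left.readframes(left.getnframes()))
        right_samples = array("h", right.readframes(right.getnframes()))

    # Pad the shorter side with silence so both channels have equal length.
    length = max(len(left_samples), len(right_samples))
    left_samples.extend([0] * (length - len(left_samples)))
    right_samples.extend([0] * (length - len(right_samples)))

    # Interleave the samples: left, right, left, right, ...
    stereo = array("h", [0]) * (2 * length)
    stereo[0::2] = left_samples
    stereo[1::2] = right_samples

    with wave.open(out_path, "wb") as out:
        out.setnchannels(2)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(stereo.tobytes())
```

A call such as `merge_mono_to_stereo("me.wav", "them.wav", "call.wav")`, with hypothetical file names, leaves you with one file carrying both parties. Run it on copies and listen to the result before uploading, since two recordings that started at different moments will be out of step.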
Keeping call recordings private
Phone calls hold personal, legal, and business-sensitive things, which is why how a tool treats the file matters more here than for, say, a podcast you are about to publish anyway. Three things protect a call recording with Hushscript.
First, it is upload-based and nothing else. Hushscript does not connect to any phone system or calling platform. The recording is on your device, and you upload it when you are ready. There is no live tap, no bot joining the call, no integration that sees your calls in the background.
Second, the audio is deleted the moment the transcript is ready. For a legal consultation or an HR call, that is the point: the recording does not linger in storage after it has done its one job. What you are left with is the transcript, not a copy of the audio.
Third, the transcript that remains is encrypted at rest. If our storage were ever leaked, the contents would surface as unreadable ciphertext rather than your words. That is leak protection, stated plainly. It is not a claim that no one on our side can ever read a transcript, because the key is held on the server, not by you alone. The honest version is the useful one: a breach exposes ciphertext, and you can delete any transcript yourself in one click. The fuller account of how the pipeline handles your audio is on the private transcription page.
If the call was actually a video call, on FaceTime, WhatsApp, or Zoom with the camera on, the audio is extracted from the video file in your browser before anything uploads. The video itself stays on your device.
Troubleshooting common issues
Most call recordings transcribe without fuss. When the result is off, the cause is usually one of these, and most are fixable.
Only one speaker shows up
This is the most common call-recording problem, and it means the file captured only your side. The transcript is accurate, but there is one voice. The fix is at the recording end: choose a recorder that captures both the earpiece and the microphone, or put the call on speakerphone so both people are audible to the phone’s mic.
The audio is compressed or noisy
Calls carry artifacts that face-to-face recordings do not. Phone and VoIP codecs compress heavily, and many duck the volume when both people talk at once. Accuracy is usually a little lower than an in-person recording as a result, so plan to skim proper nouns, numbers, and any quiet stretches against the audio rather than trusting them blind.
Strong accents or fast speech
Accents are handled well, but a heavy accent over fast, overlapping speech is the hardest case for any transcription engine, and call compression makes it harder. Expect the occasional swapped word. Names and technical terms are the usual casualties, so proofread those first.
Two people talking over each other
When voices overlap, the boundary between speakers blurs and a few words can land under the wrong tag. There is no perfect fix for crosstalk in a recording that already exists. If you control the next call, simply not talking over each other does more for the transcript than any setting.
The file is large or long
Long calls are fine, up to 10 hours or 2 GB per file with no daily limit, but a big file takes longer to process. You do not need to wait on the page; start it, leave, and collect the transcript from your dashboard. If a file refuses to upload, the cause is usually a flaky connection rather than the file, and retrying on a stable network normally clears it.
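If you are batching a lot of long recordings, a quick check against the limits before you start can save a failed upload. The sketch below is only an illustration of the two figures quoted above. It assumes that 2 GB means two billion bytes (the more cautious reading), that a file must stay within both the size and the duration limit, and that you already know each recording's length from your recorder app. The names `MAX_BYTES`, `MAX_SECONDS`, `within_limits`, and `check_file` are made up for this example.

```python
import os

# Limits quoted in this guide; the exact byte count behind "2 GB" is an assumption.
MAX_BYTES = 2 * 1000**3
MAX_SECONDS = 10 * 60 * 60


def within_limits(size_bytes: int, duration_seconds: float) -> bool:
    """True when a recording fits both the size and the duration limit."""
    return size_bytes <= MAX_BYTES and duration_seconds <= MAX_SECONDS


def check_file(path: str, duration_seconds: float) -> bool:
    """Read the size from disk; the duration is supplied by the caller."""
    return within_limits(os.path.getsize(path), duration_seconds)
```

A 12-minute call of around 50 MB passes easily, while anything longer than 10 hours fails regardless of its size.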
An unusual file format
If your recorder produced AMR, OGG, or something other than MP3 or M4A, drop it as-is first. You do not need to convert anything. If a format genuinely is not accepted, email support@hushscript.com and we will add it.
Preview first, or just transcribe?
The 30-second preview is free and needs no account, so the rule is simple. When the audio quality is in any doubt, and a recorded call usually is, preview first. In 30 seconds you will see whether both sides separate and whether the words are landing, before you spend any of your minutes. If you already know the recording is clean, both parties clear and close to the mic, you can skip straight to signing up and transcribing the whole call.
Accuracy tips for call recordings
A few habits raise the quality of a call transcript more than any single setting.
- Capture both sides in one file. It is the single biggest factor for a usable call transcript. A recorder that taps both the earpiece and the mic, or a conference-style recording line, beats a one-sided capture every time.
- Record somewhere quiet. Background noise on either end costs you words. A quiet room on your side is the part you can control.
- Let the preview vet the file. It is the cheapest accuracy check you have: free, no account, 30 seconds, before you commit any minutes.
- Proofread proper nouns and numbers first. On a call about a contract or a schedule, the figures and names are what you cannot afford to get wrong, and they are exactly where errors cluster.
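The last tip can be partly automated. The sketch below is a rough heuristic rather than a real proofreading tool: it pulls out anything that looks like a number, and any capitalised word that is not at the start of a sentence, so you know which words to check against the audio first. The function name `flag_for_review` is made up for this example. It will miss names that open a sentence and will flag some ordinary capitalised words.

```python
import re

# Digits with optional internal separators, e.g. "60", "1,200", "4.5".
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")

# Punctuation stripped from the ends of a word before it is inspected.
EDGE_PUNCTUATION = ".,;:!?\"'()“”‘’"


def flag_for_review(text: str) -> tuple[list[str], list[str]]:
    """Return (numbers, likely proper nouns) found in one transcript turn."""
    numbers = NUMBER_RE.findall(text)
    names: list[str] = []
    sentence_start = True
    for word in text.split():
        core = word.strip(EDGE_PUNCTUATION)
        looks_like_name = (
            bool(core)
            and core[0].isupper()
            and core[1:].islower()
            and core[:2] not in ("I'", "I’")  # skip "I've", "I'm", and so on
        )
        if looks_like_name and not sentence_start:
            names.append(core)
        # The next word opens a sentence if this one ends with closing punctuation.
        sentence_start = word.endswith((".", "?", "!"))
    return numbers, names


numbers, names = flag_for_review(
    "We can look at 60 days, but I've asked Acme about 1,200 units."
)
print(numbers, names)
```

For that sentence the numbers are 60 and 1,200 and the one flagged name is Acme, which are the items worth confirming against the recording before the transcript goes anywhere official.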
After transcription
A call transcript is mostly useful as a record. Export it as DOCX for a formatted document with attributed speaker lines, the kind of thing you would file after a negotiation or attach to a follow-up email. The timestamps let you point to exactly when something was said, which matters for a dispute or any record where timing counts. And a phone interview transcribes the same way as an in-person one, so the workflow carries straight over.
Many call recordings are M4A, and if you are working specifically with Apple Voice Memos or recorder apps that write that format, the guide on how to convert M4A to text covers the same ground with the format-specific detail. For multi-speaker recordings in general, the guide on how to transcribe an interview walks through the same speaker-separation workflow with a research angle.