Reading PDF documents can be tedious. It is often hard to find the keywords and important information you need in them. If you've ever found yourself in that position, don't worry: this tutorial has something for you.
In this tutorial, we'll build a simple PDF summarizer in Next.js using PDF.js, Gemini AI, and Strapi. Along the way, you'll learn what each of these tools does and how they fit together.
To follow through this tutorial, the following prerequisites should be met:
Before setting up and installing the tools needed to build, it’s important to familiarize yourself with each tool's purpose.
Let's take a glance at each of them:
Next.js: We'll use it, together with Tailwind CSS, to build the frontend.
PDF.js: This open-source JavaScript library allows web browsers to render Portable Document Format (PDF) files directly in the browser without additional plugins. For our project, we'll use it to extract the text from the uploaded PDF, which Gemini will then summarize.
Gemini AI: A powerful and versatile large language model (LLM) created by Google. For our project, we'll use it to summarize the text that PDF.js extracts from the PDF.
To use Gemini AI for summarization, you need an API key. Head to the "Get an API key" section and click "Create API key" to create your key:
Save the API key somewhere safe because you'll need it when you start writing code.
To get the CDN links for PDF.js, navigate to the cdnjs site and copy the following links (version 4.3.136 is the one used in the layout file later in this tutorial):
https://cdnjs.cloudflare.com/ajax/libs/pdf.js/4.3.136/pdf.min.mjs
https://cdnjs.cloudflare.com/ajax/libs/pdf.js/4.3.136/pdf.worker.mjs
https://cdnjs.cloudflare.com/ajax/libs/pdf.js/4.3.136/pdf_viewer.min.css
That's all you need from the site for PDF.js.
You need to have Strapi installed on your local machine. This project uses Strapi v5. In the terminal, navigate to the folder where you want the project installed and run this command:
npx create-strapi-app@rc my-project --quickstart
Replace my-project with the project name you intend to use.
This command creates a new project and installs Strapi on your local machine. The --quickstart flag ensures your application starts immediately after installation.
You’ll need to sign up to access your Strapi dashboard. After you’re done signing up, you should have a dashboard like this:
The next step in setting up Strapi is to create a collection type. Navigate to Content-Type Builder on the side navigation bar of your dashboard and click on Create new collection type. Give your new collection type a display name. Ensure it's singular, not plural, as Strapi automatically pluralizes it. I'll name mine SummarizedPDF; go ahead and give yours a unique name, but remember to adjust the summarized-pdfs API paths used later in this tutorial to match.
Click the "Continue" button. This takes you to the next step: selecting the fields for your collection type. For the SummarizedPDF collection type, you need to create two fields:
The title: choose Text and select short text. I'll simply name mine Title.
Click "Add another field" to add the next field.
The summary: choose Text again, but this time select long text. I'll simply name mine Summary.
Click "Finish" and then save the collection type. Your collection type should look like this:
After creating your collection type, head over to Settings > USERS & PERMISSIONS PLUGIN > Roles, and click on Public. Then scroll down to Permissions, click on Summarized-Pdf, select all to authorize the application's activity, and click Save.
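Once the permissions are saved, it can be handy to confirm that the public endpoint really is reachable before writing any frontend code. The sketch below is one possible way to do that. The helper name checkCollectionAccess is hypothetical, the fetch function is passed in so the helper can be tried without a running server, and the meaning attached to the 403 and 404 status codes is an assumption about how Strapi typically responds.

```javascript
// Sketch: probe the public Strapi endpoint and describe the most likely failure.
// `fetchImpl` is injected (pass the global fetch in real use) so the function
// can be exercised without a running Strapi server.
async function checkCollectionAccess(fetchImpl, baseUrl = "http://localhost:1337") {
  const response = await fetchImpl(`${baseUrl}/api/summarized-pdfs`);
  if (response.ok) return "ok";
  // Assumption: 403 means the Public role lacks the permission,
  // 404 means the collection path does not match the API ID.
  if (response.status === 403) return "forbidden: enable the Public permissions";
  if (response.status === 404) return "not found: check the collection's API ID";
  return `unexpected status ${response.status}`;
}
```

With Strapi running locally, calling checkCollectionAccess(fetch).then(console.log) from a Node 18+ script should print the result.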
So we've finished setting up Strapi. Let us now set up and develop the frontend.
To install Next.js, navigate to the project folder again through the terminal. Run the following command:
npx create-next-app@latest
Give your app a name, and following the instructions, select your options:
✔ Would you like to use TypeScript? … No / Yes
✔ Would you like to use ESLint? … No / Yes
✔ Would you like to use Tailwind CSS? … No / Yes
✔ Would you like to use `src/` directory? … No / Yes
✔ Would you like to use App Router? (recommended) … No / Yes
✔ Would you like to customize the default import alias (@/*)? … No / Yes
For this project, we won't use TypeScript or the src directory, but we will use Tailwind CSS and the App Router.
After the installation, change into the project directory and start the application:
cd your-app
npm run dev
Now you’ve got everything set up, so let’s start building.
You'll need to install two packages:
@google/generative-ai: This package interacts with Google's generative AI models. It lets you access these models from your application, enabling you to build AI-powered features. For this project, we'll use it to ask Gemini for the summary.
react-markdown: React Markdown is a library that lets you render Markdown content in your React applications. For this project, we'll use it to render the summarized content, since Gemini sometimes returns its output in Markdown format.
To install these packages, run the following command:
npm install @google/generative-ai react-markdown
For the PDF.js part, you need to add the links copied earlier to the layout file. Inside layout.js, replace the return statement with this:
return (
  <html lang="en">
    <head>
      {/* Core PDF.js library; the page code relies on the pdfjsLib global it provides */}
      <script
        type="module"
        src="https://cdnjs.cloudflare.com/ajax/libs/pdf.js/4.3.136/pdf.min.mjs"
      ></script>
      {/* The viewer styles are CSS, so they load through a link tag, not a script tag */}
      <link
        rel="stylesheet"
        href="https://cdnjs.cloudflare.com/ajax/libs/pdf.js/4.3.136/pdf_viewer.min.css"
      />
      {/* Worker script that handles PDF processing */}
      <script
        type="module"
        src="https://cdnjs.cloudflare.com/ajax/libs/pdf.js/4.3.136/pdf.worker.mjs"
      ></script>
    </head>
    <body className={inter.className}>{children}</body>
  </html>
);
Everything else in the file remains the same; we only added the PDF.js assets to the head.
The pdf.min.mjs script provides the core functionality for parsing and displaying PDFs. The pdf_viewer.min.css stylesheet defines the styles for the PDF viewer component, while the pdf.worker.mjs script handles the background operations for PDF processing.
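Because the library arrives through a script tag, the page code reaches it through a global called pdfjsLib. If the script has not loaded yet, the resulting error can be confusing, so a small guard may help. This is only a sketch: the helper name getPdfjs and its scope parameter are hypothetical, added so the function can be tried outside a browser.

```javascript
// Sketch: look up the PDF.js global and fail with a readable message when the
// CDN script has not loaded. `scope` defaults to globalThis and is a parameter
// only so the helper can be exercised without a browser.
function getPdfjs(scope = globalThis) {
  const lib = scope.pdfjsLib;
  if (!lib || typeof lib.getDocument !== "function") {
    throw new Error("PDF.js is not loaded yet; check the script tags in layout.js");
  }
  return lib;
}
```

In the page component, you could then call getPdfjs().getDocument(...) in place of using the bare global.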
Navigate to your page.js file and replace its contents with the following code:
"use client";
import { useState, useRef } from "react";
import ReactMarkdown from "react-markdown";

export default function Home() {
  const [summary, setSummary] = useState("");
  const [title, setTitle] = useState("");
  const [file, setFile] = useState(null);
  const [showSummary, setShowSummary] = useState(false);
  const [isLoading, setIsLoading] = useState(false);
  const fileInputRef = useRef(null);

  function onFileChange(event) {
    setFile(event.target.files[0]);
  }

  function handleTitleChange(event) {
    setTitle(event.target.value);
  }

  async function handleShowSummary() {
    if (file && title) {
      setIsLoading(true);
      const fileReader = new FileReader();
      fileReader.onload = async (event) => {
        try {
          // pdfjsLib is the global provided by the PDF.js script in layout.js
          const typedarray = new Uint8Array(event.target.result);
          const pdf = await pdfjsLib.getDocument({ data: typedarray }).promise;
          console.log("loaded pdf:", pdf.numPages);

          // Collect the text of every page into one string
          let text = "";
          for (let pageNum = 1; pageNum <= pdf.numPages; pageNum++) {
            const page = await pdf.getPage(pageNum);
            const content = await page.getTextContent();
            content.items.forEach((item) => {
              text += item.str + " ";
            });
          }

          sendToAPI(text);
        } catch (error) {
          // Without this, a failed parse would leave the loading state stuck
          console.error("Could not read the PDF:", error);
          setIsLoading(false);
        }
      };
      fileReader.readAsArrayBuffer(file);
    }
  }

  function sendToAPI(text) {
    console.log("Sending title to API:", title);
    fetch("/api/", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ text, title }),
    })
      .then((response) => {
        if (!response.ok) {
          throw new Error("Network response was not ok " + response.statusText);
        }
        return response.json();
      })
      .then((data) => {
        if (data.success) {
          setSummary(data.Summary);
          setShowSummary(true);
        } else {
          throw new Error(data.message || "Unknown error occurred");
        }
      })
      .catch((error) => {
        console.error("There was a problem with the fetch operation:", error);
      })
      .finally(() => {
        setIsLoading(false);
      });
  }

  function handleClear() {
    setSummary("");
    setTitle("");
    setFile(null);
    fileInputRef.current.value = null;
    setShowSummary(false);
  }

  return (
    <div className="min-h-screen flex flex-col items-center justify-center bg-[#32324d]">
      <div className="p-8 rounded shadow-md w-full max-w-md mb-6 border border-gray-600 bg-[#32324d]">
        <h1 className="text-2xl font-bold mb-4 text-center text-white">
          Upload PDF
        </h1>
        <form className="space-y-4">
          <div>
            <label
              htmlFor="title"
              className="block text-sm font-medium text-gray-200"
            >
              Enter Title
            </label>
            <input
              id="title"
              type="text"
              placeholder="Enter Title"
              value={title}
              onChange={handleTitleChange}
              className="border border-gray-500 rounded p-2 mt-1 w-full text-white bg-gray-700"
              required
            />
          </div>
          <div>
            <label
              htmlFor="file"
              className="block text-sm font-medium text-gray-200"
            >
              Upload PDF
            </label>
            <input
              id="file"
              type="file"
              name="file"
              accept=".pdf"
              onChange={onFileChange}
              ref={fileInputRef}
              className="border border-gray-500 text-white rounded p-2 mt-1 w-full bg-gray-700"
              required
            />
          </div>
          <div className="flex justify-between items-center">
            <button
              type="button"
              onClick={handleShowSummary}
              className="text-white px-4 py-2 rounded hover:opacity-90 bg-[#4945ff]"
              disabled={!file || !title || isLoading}
            >
              Show Summary
            </button>
            <button
              type="button"
              onClick={handleClear}
              className="bg-red-600 text-white px-4 py-2 rounded hover:bg-red-500"
            >
              Clear
            </button>
          </div>
        </form>

        {isLoading && <p className="text-yellow-300 mt-4">Summarizing...</p>}

        {showSummary && summary && (
          <div className="mt-6 p-4 border rounded border-gray-600 bg-gray-700">
            <h2 className="text-xl text-gray-100 font-semibold mb-2">
              {title}
            </h2>
            {/* Styling sits on a wrapper div because some react-markdown versions do not accept a className prop */}
            <div className="text-gray-200">
              <ReactMarkdown>{summary}</ReactMarkdown>
            </div>
          </div>
        )}
      </div>
    </div>
  );
}
So, aside from importing the packages needed, here's a breakdown of what the code does:
State Management
The component keeps track of the following values:
summary: Stores the PDF's summarized content.
title: Saves the title of the PDF provided by the user.
file: Contains the uploaded PDF file.
showSummary: A boolean indicating whether the summary should be displayed.
isLoading: A boolean indicating whether the summarization process is currently underway.
fileInputRef: A reference to the file input element, which allows for direct manipulation of the field.
Event Handlers
onFileChange: This function updates the file state with the PDF file that the user has selected.
handleTitleChange: This function updates the title state with the value entered by the user.
handleShowSummary(): This function reads the uploaded PDF, extracts the text content, and sends that content and the title to an API for summarizing. It uses FileReader to read the file as an ArrayBuffer and pdfjsLib to extract text from each page of the PDF. After extracting the text, it sends the data to an API endpoint (which we'll create next) via a POST request. The response from the API, which contains the summary, is then stored in the summary state and displayed on the page.
handleClear(): This function resets the form by clearing the state variables (summary, title, and file) as well as the file input field, ready for a new input.
The JSX structure defines a form where the user can input a title and upload a PDF. It also includes buttons to either generate a summary or clear the form.
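The text extraction step in handleShowSummary() joins every item with a space. As a possible refinement, the sketch below pulls that step into a small pure function that also keeps line breaks. The function name itemsToText is hypothetical, and the sketch assumes that PDF.js text items carry a str string and, in recent versions, a hasEOL flag marking the end of a line; items without a str are skipped.

```javascript
// Sketch: turn the items of page.getTextContent() into one string.
// Assumes each text item has a `str` string and optionally a `hasEOL` flag.
function itemsToText(items) {
  let text = "";
  for (const item of items) {
    if (typeof item.str !== "string") continue; // skip non-text entries
    text += item.str;
    text += item.hasEOL ? "\n" : " ";
  }
  // Collapse runs of spaces or tabs and drop leading/trailing whitespace
  return text.replace(/[ \t]+/g, " ").trim();
}

const sampleItems = [
  { str: "Hello" },
  { str: "world", hasEOL: true },
  { str: "Next line" },
];
console.log(JSON.stringify(itemsToText(sampleItems)));
```

Inside the page loop, this would be used as text += itemsToText(content.items) + "\n".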
If you check the result in your browser, you should see something like this:
Looking nice, right?
Create a .env file in the root folder and add your Gemini API key:
API_KEY=YOUR_GEMINI_API_KEY
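A missing or empty key usually only shows up later as a confusing error from the Gemini client, so you may want to check for it up front. Here is a minimal sketch of such a check; the helper name requireEnv is hypothetical and the env parameter exists only so the function can be tried with a plain object.

```javascript
// Sketch: read a required environment variable or fail early with a clear message.
// `env` defaults to process.env; pass a plain object to try the helper in isolation.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing environment variable ${name}; add it to your .env file`);
  }
  return value.trim();
}
```

On the server side, the key could then be read with requireEnv("API_KEY") before it is handed to the Gemini client.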
Now, let's create the endpoint that receives the text extracted from the PDF, summarizes it with Gemini, stores the summary in Strapi, and returns it to the page for display. This is where the @google/generative-ai package comes in.
Next, inside the app folder, create a folder called api and a file called route.js inside it. Add the following code:
import { NextResponse } from "next/server";
import { GoogleGenerativeAI } from "@google/generative-ai";

// The API key is read from the .env file
const genAI = new GoogleGenerativeAI(process.env.API_KEY);

const model = genAI.getGenerativeModel({ model: "gemini-pro" });

export async function POST(req) {
  try {
    const body = await req.json();
    console.log("Received title:", body.title);
    console.log("Received text length:", body.text.length);

    if (!body.title) {
      throw new Error("No title provided");
    }

    // Ask Gemini for a summary of the extracted text
    const prompt = "summarize the following extracted texts: " + body.text;
    const result = await model.generateContent(prompt);
    const summaryText = result.response.text();

    console.log("Summary generated successfully");

    // Store the title and summary in the Strapi collection
    const strapiRes = await fetch("http://localhost:1337/api/summarized-pdfs", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        data: {
          Title: body.title,
          Summary: summaryText,
        },
      }),
    });

    if (!strapiRes.ok) {
      const errorText = await strapiRes.text();
      console.error("Strapi error response:", errorText);
      throw new Error(
        `Failed to store summary in Strapi: ${strapiRes.status} ${strapiRes.statusText}`,
      );
    }

    const strapiData = await strapiRes.json();
    console.log("Successfully stored in Strapi:", strapiData);

    return NextResponse.json({
      success: true,
      message: "Text summarized and stored successfully",
      Summary: summaryText,
    });
  } catch (error) {
    console.error("Error in API route:", error);
    return NextResponse.json(
      {
        success: false,
        message: "Error processing request",
        error: error.message,
      },
      { status: 500 },
    );
  }
}
So here's the breakdown of the code:
POST Request Handling
The POST function handles incoming POST requests asynchronously. It parses the request body as JSON to access the text and title from the client, and logs the received title and text length for debugging.
Input Validation
A validation check ensures that a title is provided. If no title is provided, an error will be thrown, which will be caught and handled later.
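The route only checks the title. If you also want to reject requests with no extracted text, the check could be pulled into a small function along these lines. This is a sketch: the function name validateSummaryRequest and the "No text provided" message are assumptions, while the "No title provided" message matches the route code.

```javascript
// Sketch: validate the request body before calling Gemini.
// Returns the trimmed title and text, or throws with a descriptive message.
function validateSummaryRequest(body) {
  const title = typeof body?.title === "string" ? body.title.trim() : "";
  const text = typeof body?.text === "string" ? body.text.trim() : "";
  if (!title) throw new Error("No title provided");
  if (!text) throw new Error("No text provided"); // hypothetical extra check
  return { title, text };
}
```

Inside POST, const { title, text } = validateSummaryRequest(body) would replace the single title check, and any thrown error would still land in the existing catch block.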
Summarize Text
We create a prompt that instructs Gemini to summarize the extracted text. The generateContent() method then generates the summary, which we read from the AI response with result.response.text().
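Long PDFs can produce a lot of text, and models have input limits, so it may be worth tidying and capping the text before building the prompt. The sketch below shows one way to do that. The function name buildSummaryPrompt and the 30000-character default are illustrative assumptions, not limits documented for Gemini; the prompt prefix is the one used in the route.

```javascript
// Prompt prefix taken from the route code
const PROMPT_PREFIX = "summarize the following extracted texts: ";

// Sketch: normalize whitespace and cap the text length before prompting.
// The 30000-character default is an arbitrary illustrative budget.
function buildSummaryPrompt(text, maxChars = 30000) {
  const cleaned = text.replace(/\s+/g, " ").trim();
  const clipped = cleaned.length > maxChars ? cleaned.slice(0, maxChars) : cleaned;
  return PROMPT_PREFIX + clipped;
}
```

Cutting the text at a fixed length is crude; splitting the document into chunks and summarizing each one would preserve more content at the cost of extra requests.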
Store the summary in Strapi
The code makes a POST request to Strapi's /api/summarized-pdfs endpoint, with the title and the summarized content in the request body. If the request to Strapi fails, the error response is logged and an error is thrown, which is caught and handled later.
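To make the shape of that request easier to see, the sketch below builds the URL and fetch options separately from sending them. The helper name buildStrapiRequest and its baseUrl parameter are hypothetical; the endpoint path and the Title and Summary field names are the ones used in the route code, wrapped in the data object that the route sends to Strapi.

```javascript
// Sketch: build the URL and fetch options for storing one summary in Strapi.
// The payload is wrapped in a `data` object, as in the route code.
function buildStrapiRequest(title, summary, baseUrl = "http://localhost:1337") {
  return {
    url: `${baseUrl}/api/summarized-pdfs`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ data: { Title: title, Summary: summary } }),
    },
  };
}

const example = buildStrapiRequest("My PDF", "A short summary");
console.log(example.url, example.options.body);
```

Keeping the base URL as a parameter would also make it simple to point the route at a deployed Strapi instance later.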
Let's test the app to see if it works:
You can see it summarizes the PDF. It is also added to your Strapi backend:
We’ve accomplished summarizing a PDF and storing the summarized content in Strapi! That’s huge!
Now, you can choose to stop here or continue with me by creating a table where you can view all your summarized PDFs without having to navigate to the Strapi backend. Let’s try to add that in the next section.
To do this, you'll need to use the Link component. Go back to your page.js and import it into the page:
import Link from "next/link";
Below the div created for displaying the summarized PDF, add the following:
<div className="w-full max-w-md text-center">
  <Link
    href="/summaries"
    className="bg-green-600 text-white px-6 py-2 rounded hover:bg-green-500 transition-colors inline-block"
  >
    View Summarized PDFs
  </Link>
</div>
Now, let's create the page that lists the summaries. Inside the app folder, create a folder called summaries. Inside that folder, create a file called page.js and add the following code:
"use client";
import { useState, useEffect } from "react";
import Link from "next/link";
import ReactMarkdown from "react-markdown";

export default function Summaries() {
  const [summaries, setSummaries] = useState([]);
  const [isLoading, setIsLoading] = useState(true);
  const [error, setError] = useState(null);

  // Load the summaries once, when the page first renders
  useEffect(() => {
    fetchSummaries();
  }, []);

  const fetchSummaries = async () => {
    try {
      const response = await fetch("http://localhost:1337/api/summarized-pdfs");
      if (!response.ok) {
        throw new Error("Failed to fetch summaries");
      }
      const data = await response.json();
      console.log("Fetched data:", data);
      setSummaries(data.data || []);
      setIsLoading(false);
    } catch (error) {
      console.error("Fetch error:", error);
      setError(error.message);
      setIsLoading(false);
    }
  };

  if (isLoading) return <div className="text-white">Loading...</div>;
  if (error) return <div className="text-white">Error: {error}</div>;

  return (
    <div className="min-h-screen bg-[#32324d] py-8 text-white">
      <div className="max-w-4xl mx-auto">
        <h1 className="text-3xl font-bold mb-8 text-center">Summarized PDFs</h1>
        <Link
          href="/"
          className="bg-[#4945ff] text-white px-4 py-2 rounded mb-4 inline-block"
        >
          Back to Upload
        </Link>
        {summaries.length === 0 ? (
          <p>No summaries available.</p>
        ) : (
          <table className="min-w-full bg-gray-800 border-collapse">
            <thead>
              <tr>
                <th className="border border-gray-600 px-4 py-2">ID</th>
                <th className="border border-gray-600 px-4 py-2">Title</th>
                <th className="border border-gray-600 px-4 py-2">Short Text</th>
                <th className="border border-gray-600 px-4 py-2">View</th>
              </tr>
            </thead>
            <tbody>
              {summaries.map((summary) => (
                <tr key={summary.id} className="hover:bg-gray-700">
                  <td className="border border-gray-600 px-4 py-2">
                    {summary.id}
                  </td>
                  <td className="border border-gray-600 px-4 py-2">
                    {summary.Title}
                  </td>
                  <td className="border border-gray-600 px-4 py-2">
                    {/* Styling sits on a wrapper div because some react-markdown versions do not accept a className prop */}
                    <div className="prose prose-invert max-w-none">
                      <ReactMarkdown>
                        {typeof summary.Summary === "string"
                          ? summary.Summary.slice(0, 100) + "..."
                          : "Summary not available"}
                      </ReactMarkdown>
                    </div>
                  </td>
                  <td className="border border-gray-600 px-4 py-2">
                    <Link
                      href={`/summaries/${summary.id}`}
                      className="bg-[#4945ff] text-white px-4 py-2 rounded"
                    >
                      View
                    </Link>
                  </td>
                </tr>
              ))}
            </tbody>
          </table>
        )}
      </div>
    </div>
  );
}
In the code above, we created a React component that displays a list of summarized PDFs fetched from a Strapi backend. It renders a table with summary details, including an ID, title, and a shortened version of the summary content.
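The table shortens each summary to 100 characters and always appends an ellipsis, even when the summary is already short. If you prefer the ellipsis only on summaries that were actually cut, a helper like the one sketched below could be used. The name previewText is hypothetical; the 100-character default and the fallback message mirror the component code.

```javascript
// Sketch: shorten a summary for the table, adding "..." only when text was cut.
function previewText(summary, maxLength = 100) {
  if (typeof summary !== "string") return "Summary not available";
  if (summary.length <= maxLength) return summary;
  return summary.slice(0, maxLength).trimEnd() + "...";
}

console.log(previewText("A short summary"));
console.log(previewText("x".repeat(250)).length);
```

In the table cell, previewText(summary.Summary) would replace the inline conditional expression.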
The href={`/summaries/${summary.id}`} in the "View" link dynamically generates a URL based on the id of each summary. This allows you to click "View" and navigate to a page showing the specific summarized PDF for that id.
If you click the "View Summarized PDFs" button, it should redirect you to this page:
To manage dynamic routing for each summary based on its id, create a folder named [id] inside the app/summaries directory, then create two files inside it. The first one is page.js. After creating it, add the following code:
import { Suspense } from "react";
import Link from "next/link";
import SummaryContent from "./SummaryContent";

export default function SummaryPage({ params }) {
  return (
    <div className="min-h-screen bg-[#32324d] py-8 text-white">
      <div className="max-w-4xl mx-auto px-4">
        <Link
          href="/summaries"
          className="bg-[#4945ff] text-white px-4 py-2 rounded mb-4 inline-block"
        >
          Back to Summaries
        </Link>

        <Suspense fallback={<div>Loading...</div>}>
          <SummaryContent id={params.id} />
        </Suspense>
      </div>
    </div>
  );
}
In the code above, we created a component responsible for displaying the detailed view of a summarized PDF based on its id.
The Suspense component displays a fallback loading message (<div>Loading...</div>) while SummaryContent loads. The SummaryContent component (which we'll create shortly) receives the id from params.id, which corresponds to the specific summary being viewed.
Now, let's create the second file. Still inside the [id] folder, create a file called SummaryContent.js and add the following code:
"use client";

import { useState, useEffect } from "react";
import ReactMarkdown from "react-markdown";

export default function SummaryContent({ id }) {
  const [summary, setSummary] = useState(null);
  const [isLoading, setIsLoading] = useState(true);
  const [error, setError] = useState(null);

  useEffect(() => {
    const fetchSummary = async () => {
      try {
        // Ask Strapi only for the entry whose id matches the route parameter
        const response = await fetch(
          `http://localhost:1337/api/summarized-pdfs?filters[id][$eq]=${id}`,
        );
        if (!response.ok) {
          throw new Error("Failed to fetch summary");
        }
        const data = await response.json();
        if (data.data && data.data.length > 0) {
          setSummary(data.data[0]);
        } else {
          throw new Error("Summary not found");
        }
        setIsLoading(false);
      } catch (error) {
        console.error("Fetch error:", error);
        setError(error.message);
        setIsLoading(false);
      }
    };

    fetchSummary();
  }, [id]);

  if (isLoading) return <div>Loading...</div>;
  if (error) return <div>Error: {error}</div>;
  if (!summary) return <div>Summary not found</div>;

  return (
    <>
      <h1 className="text-3xl font-bold mb-4">{summary.Title}</h1>
      <div className="bg-gray-800 p-6 rounded-lg">
        {/* Styling sits on a wrapper div because some react-markdown versions do not accept a className prop */}
        <div className="prose prose-invert max-w-none">
          <ReactMarkdown>{summary.Summary}</ReactMarkdown>
        </div>
      </div>
    </>
  );
}
The SummaryContent component fetches and displays a specific summarized PDF based on the provided id.
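The id comes straight from the URL, so it is a good habit to encode it before placing it in the Strapi query string. A minimal sketch of a URL builder follows; the helper name summaryByIdUrl and its baseUrl parameter are hypothetical, while the filters[id][$eq] query is the one used in the component.

```javascript
// Sketch: build the Strapi URL that filters the collection by id.
// encodeURIComponent keeps unexpected characters in the id from altering the query.
function summaryByIdUrl(id, baseUrl = "http://localhost:1337") {
  const safeId = encodeURIComponent(String(id));
  return `${baseUrl}/api/summarized-pdfs?filters[id][$eq]=${safeId}`;
}

console.log(summaryByIdUrl(5));
```

In the component, fetch(summaryByIdUrl(id)) would replace the inline template string.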
Now let’s check the result in the browser to see if it works:
It works! Our PDF summarizer is complete! Here’s the link to the code on GitHub.
That’s How to Create a PDF Summarizer.
In this tutorial, we learned how to build a PDF summarizer in Next.js using PDF.js, Gemini, and Strapi. You can enhance yours by adding other features. There is a lot you can build using AI tools and Strapi.
I'd love to see what you build. Please share this tutorial if you found it helpful.
I'm a web developer and writer. I love to share my experiences and things I've learned through writing.