vibe-coded JSON downloaded: stackexchange-cli [options] <url1> [url2]...#8

Open
drzraf wants to merge 3 commits into jagreetdg:main from drzraf:vibe-cli

drzraf commented Apr 6, 2026

It's handy to be able to quickly download the JSON for a list of questions (along with their answers and comments). With the API client already in place, this only requires adding a CLI.

Copilot AI left a comment

Pull request overview

Adds a new command-line interface for downloading Stack Exchange question/answer JSON (including related answers/comments) to disk, leveraging the existing API client.

Changes:

  • Introduces stackexchange-cli entrypoint that parses URLs, fetches data, and writes data.json under an output directory.
  • Adds getQuestionAnswers() to the API client to retrieve answers for a specific question.
  • Registers the new CLI binary in package.json.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| src/cli.ts | New CLI implementation: argument parsing, URL parsing, fetch orchestration, skip/resume heuristics, file output, and help text. |
| src/api/stackexchange.ts | Adds an API client helper for fetching answers for a given question ID. |
| package.json | Exposes stackexchange-cli as an additional package binary. |

Comment thread src/cli.ts
Comment on lines +14 to +18
// Load environment variables from .env file (if it exists)
// This will not override existing environment variables
const envPath = join(dirname(__dirname), '.env');
dotenv.config({ path: envPath });

Copilot AI Apr 8, 2026

The CLI loads .env from join(dirname(__dirname), '.env'), which points at the package directory once installed (e.g. node_modules/<pkg>/.env) rather than the user’s current working directory. This will make local .env credentials in the invoking project invisible. Prefer dotenv.config() (default: process.cwd()) or try process.cwd()/.env first and only fall back to the package root if needed.

Suggested change
- // Load environment variables from .env file (if it exists)
- // This will not override existing environment variables
- const envPath = join(dirname(__dirname), '.env');
- dotenv.config({ path: envPath });
+ // Load environment variables from the invoking directory first.
+ // Fall back to the package root .env for local development if needed.
+ // This will not override existing environment variables.
+ const fallbackEnvPath = join(dirname(__dirname), '.env');
+ const envResult = dotenv.config();
+ if (!envResult.parsed) {
+   dotenv.config({ path: fallbackEnvPath });
+ }

Copilot uses AI. Check for mistakes.
Comment thread src/cli.ts Outdated
Comment thread src/cli.ts
Comment on lines +47 to +56
interface FetchedData {
  url: string;
  site: string;
  type: 'question' | 'answer';
  id: number;
  data: QuestionResponse | AnswerResponse | null;
  answers?: AnswerResponse[];
  comments?: any[];
  error?: string;
  timestamp: string;

Copilot AI Apr 8, 2026

comments?: any[] loses the strong typing already defined in src/api/interfaces.ts (CommentResponse). Using CommentResponse[] here (and updating imports) will prevent accidental shape mismatches when reading/writing data.json.
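
As a rough illustration of what the stricter typing buys, the sketch below redeclares the interface with a typed comments array. The three response shapes are stand-ins: the real definitions live in src/api/interfaces.ts and are not shown in this review, so their fields (and the helper name commentIds) are assumptions made for the example.

```typescript
// Stand-in shapes. The real ones are in src/api/interfaces.ts; the fields
// listed here are assumptions for illustration only.
interface CommentResponse {
  comment_id: number;
  body?: string;
}
interface QuestionResponse {
  question_id: number;
  body?: string;
}
interface AnswerResponse {
  answer_id: number;
  body?: string;
}

interface FetchedData {
  url: string;
  site: string;
  type: 'question' | 'answer';
  id: number;
  data: QuestionResponse | AnswerResponse | null;
  answers?: AnswerResponse[];
  comments?: CommentResponse[]; // previously any[]
  error?: string;
  timestamp: string;
}

// Hypothetical helper: with a typed element, a misspelled property such as
// c.commentId is rejected at compile time instead of yielding undefined.
function commentIds(entry: FetchedData): number[] {
  return (entry.comments ?? []).map((c) => c.comment_id);
}
```

With any[], the same helper would compile even if the property name were wrong, and the mistake would only surface when data.json is read back.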

Comment thread src/cli.ts
Comment on lines +81 to +89
const hostname = urlObj.hostname;

// Extract site from hostname (e.g., stackoverflow.com -> stackoverflow)
const siteParts = hostname.split('.');
let site = siteParts[0];

// Handle special cases
if (site === 'www') {
  site = siteParts[1];

Copilot AI Apr 8, 2026

Site extraction via hostname.split('.') and taking the first segment breaks for valid Stack Exchange hosts like meta.stackoverflow.com (API site should be meta.stackoverflow) and localized Stack Overflow sites like ru.stackoverflow.com (API site ru.stackoverflow). Consider mapping *.stackoverflow.com to <subdomain>.stackoverflow and handling meta.<site>.com as meta.<site> (and still stripping leading www).

Suggested change
- const hostname = urlObj.hostname;
- // Extract site from hostname (e.g., stackoverflow.com -> stackoverflow)
- const siteParts = hostname.split('.');
- let site = siteParts[0];
- // Handle special cases
- if (site === 'www') {
-   site = siteParts[1];
+ const hostname = urlObj.hostname.replace(/^www\./, '');
+ const siteParts = hostname.split('.');
+ let site: string;
+ // Extract Stack Exchange API site from hostname
+ // Examples:
+ // - stackoverflow.com -> stackoverflow
+ // - ru.stackoverflow.com -> ru.stackoverflow
+ // - meta.stackoverflow.com -> meta.stackoverflow
+ // - meta.serverfault.com -> meta.serverfault
+ if (siteParts.length >= 2 && siteParts[siteParts.length - 2] === 'stackoverflow' && siteParts[siteParts.length - 1] === 'com') {
+   site = siteParts.length === 2 ? 'stackoverflow' : `${siteParts.slice(0, -1).join('.')}`;
+ } else if (siteParts.length >= 3 && siteParts[0] === 'meta' && siteParts[siteParts.length - 1] === 'com') {
+   site = `meta.${siteParts[1]}`;
+ } else {
+   site = siteParts[0];
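
To make the branch logic of the suggestion above easier to check in isolation, it can be pulled into a small pure function. The sketch below is one possible shape; the name extractSite is hypothetical and not part of the PR. Only the hostnames listed in the suggestion's comments were traced by hand, and other host patterns fall through to the last branch without being verified against the API's actual site names.

```typescript
// Hypothetical helper mirroring the suggested hostname handling.
function extractSite(hostname: string): string {
  // Strip a leading "www." before splitting into labels.
  const parts = hostname.replace(/^www\./, '').split('.');
  const last = parts[parts.length - 1];
  const secondLast = parts[parts.length - 2];

  // stackoverflow.com and its subdomains: drop only the trailing "com".
  if (parts.length >= 2 && secondLast === 'stackoverflow' && last === 'com') {
    return parts.slice(0, -1).join('.');
  }
  // meta.<site>.com maps to meta.<site>.
  if (parts.length >= 3 && parts[0] === 'meta' && last === 'com') {
    return `meta.${parts[1]}`;
  }
  // Everything else: first label, as in the original code.
  return parts[0] ?? hostname;
}
```

Called with 'www.stackoverflow.com' this yields 'stackoverflow', and with 'ru.stackoverflow.com' it yields 'ru.stackoverflow', matching the examples in the suggestion.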

Comment thread src/cli.ts Outdated
Comment thread src/cli.ts
Comment on lines +254 to +260
// For questions, check if we have the question body and at least one answer OR one comment
if (urlInfo.type === 'question') {
  const hasQuestionBody = !!(data.data && 'body' in data.data && data.data.body);
  const hasAnswers = !!(data.answers && data.answers.length > 0);
  const hasComments = !!(data.comments && data.comments.length > 0);
  return hasQuestionBody && (hasAnswers || hasComments);
}

Copilot AI Apr 8, 2026

isDataComplete treats a question as incomplete unless it has at least one answer OR one comment. This will cause the CLI to re-fetch forever for legitimate questions that simply have no answers/comments (or for questions where comments are disabled), even if the question body was fetched successfully. Consider treating “complete” as “question fetched successfully and the answers/comments fetch was attempted (even if it returned an empty array)”, or at least allow 0 answers/comments as complete when the API indicates answer_count is 0.
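
One way to express the "fetch was attempted" reading is to distinguish an empty array from a missing field. The sketch below assumes the CLI writes answers and comments as arrays (possibly empty) whenever those fetches succeed and leaves them undefined otherwise; that behavior is not confirmed by the diff shown here. StoredQuestion and isQuestionComplete are hypothetical names, with the record trimmed to the fields the check needs.

```typescript
// Trimmed stand-in for the stored record; field names follow the
// FetchedData interface quoted earlier in this review.
interface StoredQuestion {
  data: { body?: string } | null;
  answers?: unknown[];
  comments?: unknown[];
  error?: string;
}

// Hypothetical replacement for the question branch of isDataComplete().
function isQuestionComplete(record: StoredQuestion): boolean {
  // A recorded error always means the entry should be retried.
  if (record.error) return false;

  const hasQuestionBody = !!record.data?.body;
  // An empty array means "fetched, nothing there";
  // undefined means the fetch never completed.
  const answersAttempted = Array.isArray(record.answers);
  const commentsAttempted = Array.isArray(record.comments);

  return hasQuestionBody && answersAttempted && commentsAttempted;
}
```

Under this rule a question with zero answers and zero comments is skipped on the next run, while a record whose answers fetch never completed is retried.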

Comment thread src/cli.ts
Comment on lines +274 to +298
/**
 * Setup authentication
 * Authentication is automatic if credentials are found in environment variables
 */
private async setupAuth(useAuth: boolean): Promise<void> {
  const apiKey = process.env.STACKEXCHANGE_API_KEY;
  const accessToken = process.env.STACKEXCHANGE_ACCESS_TOKEN;

  // If no credentials found, skip authentication
  if (!apiKey && !accessToken) {
    if (useAuth) {
      console.log('⚠️ No authentication credentials found. Using unauthenticated API (limited to 300 requests/day)');
    }
    return;
  }

  // Set auth on API client
  if (accessToken) {
    this.apiClient.setAuth({ access_token: accessToken, key: apiKey });
    console.log('🔑 Using access token for authentication');
  } else if (apiKey) {
    this.apiClient.setAuth({ key: apiKey });
    console.log('🔑 Using API key for authentication');
  }
}

Copilot AI Apr 8, 2026

The -a/--auth flag is documented as enabling authenticated requests, but setupAuth() applies credentials whenever they exist, regardless of useAuth (the flag only changes whether a warning is printed when credentials are missing). Either (1) make auth truly opt-in (only call setAuth when useAuth is true), or (2) update the help text/flag name to reflect the current behavior (e.g. “--auth-warn” / “--no-auth”).
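
Option (1) could look roughly like the sketch below, which moves the decision into a pure function so it can be tested without touching the API client. The names resolveAuth and AuthParams are hypothetical; the environment variable names are the ones used in the PR, and the returned object mirrors what the existing code passes to setAuth().

```typescript
// Assumed shape of the object accepted by setAuth(), based on the calls
// shown in the diff above.
interface AuthParams {
  key?: string;
  access_token?: string;
}

// Hypothetical helper: returns the credentials to apply, or null when
// requests should stay unauthenticated.
function resolveAuth(
  useAuth: boolean,
  env: Record<string, string | undefined>,
): AuthParams | null {
  // Opt-in: without the flag, credentials in the environment are ignored.
  if (!useAuth) return null;

  const apiKey = env['STACKEXCHANGE_API_KEY'];
  const accessToken = env['STACKEXCHANGE_ACCESS_TOKEN'];
  if (!apiKey && !accessToken) return null;

  const auth: AuthParams = {};
  if (apiKey) auth.key = apiKey;
  if (accessToken) auth.access_token = accessToken;
  return auth;
}
```

setupAuth() would then call resolveAuth(useAuth, process.env) and invoke this.apiClient.setAuth() only when the result is non-null, keeping the existing warning for the case where the flag is set but no credentials are found.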

Comment thread src/cli.ts
Resume:
The CLI automatically skips URLs that have already been fetched successfully.
A URL is considered complete if the data file exists with valid content:
- For questions: must have question body and at least one answer

Copilot AI Apr 8, 2026

Help text says a URL is complete when “For questions: must have question body and at least one answer”, but isDataComplete() currently accepts “at least one answer OR one comment”. Update the help text (or the completeness logic) so users aren’t surprised by what gets skipped/re-fetched.

Suggested change
-   - For questions: must have question body and at least one answer
+   - For questions: must have question body and at least one answer or comment

drzraf and others added 2 commits April 8, 2026 15:55
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>