So I immediately checked the code that reads the data from the server:
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

/**
 * Returns the HTML data at the given URL.
 *
 * @param urlStr
 * @return
 * @throws CommonException
 */
public static String doGet(String urlStr) throws CommonException {
    StringBuffer sb = new StringBuffer();
    try {
        URL url = new URL(urlStr);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setDoInput(true);
        conn.setDoOutput(true); // only needed when sending a request body
        if (conn.getResponseCode() == 200) {
            InputStream is = conn.getInputStream();
            int len = 0;
            byte[] buf = new byte[1024];
            while ((len = is.read(buf)) != -1) {
                // Each chunk of raw bytes is decoded on its own.
                sb.append(new String(buf, 0, len, "UTF-8"));
            }
            is.close();
        } else {
            throw new CommonException("Network request failed!");
        }
    } catch (Exception e) {
        throw new CommonException("Network request failed!");
    }
    return sb.toString();
}
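I found the likely cause: I was reading from the network with a byte stream, up to 1024 bytes at a time, and converting each chunk to a string as soon as it was read. The data is encoded as UTF-8, which is a variable-length encoding (1 byte for ASCII characters such as English letters, 3 bytes for most Chinese characters), so a 1024-byte chunk can end partway through a Chinese character, holding only its first byte or two. Decoding that chunk on its own then produces garbled text. The one thing I don't understand is that in a plain Java environment, printing the result to the console shows no garbling. One possible factor is that read(buf) returns however many bytes happen to be available rather than always a full 1024, so where the chunk boundaries fall can differ between environments. If you have a different explanation, feel free to leave a comment.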
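To see the effect in isolation, here is a minimal sketch that needs no network. It is not the original code: the class name Utf8ChunkDemo, both helper methods, and the chunk size of 4 are made up for illustration, and the chunk size is chosen only so that it does not line up with the 3-byte characters.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class Utf8ChunkDemo {

    // Decodes every chunk of raw bytes on its own, like the byte-stream doGet.
    static String decodePerChunk(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        try (InputStream is = new ByteArrayInputStream(data)) {
            byte[] buf = new byte[chunkSize];
            int len;
            while ((len = is.read(buf)) != -1) {
                sb.append(new String(buf, 0, len, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Lets InputStreamReader decode, so an incomplete byte sequence
    // is carried over to the next read instead of being decoded alone.
    static String decodeWithReader(byte[] data, int chunkSize) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            char[] buf = new char[chunkSize];
            int len;
            while ((len = reader.read(buf)) != -1) {
                sb.append(buf, 0, len);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Six Chinese characters, 3 bytes each in UTF-8.
        String text = "\u6c49\u5b57\u4e71\u7801\u6d4b\u8bd5";
        byte[] data = text.getBytes(StandardCharsets.UTF_8);

        System.out.println(data.length);                             // 18
        System.out.println(text.equals(decodePerChunk(data, 4)));    // false
        System.out.println(text.equals(decodePerChunk(data, 3)));    // true
        System.out.println(text.equals(decodeWithReader(data, 4)));  // true
    }
}
```

With a chunk size of 3 the boundaries happen to fall between characters, so per-chunk decoding works by luck. That kind of alignment may be part of why the garbling shows up in one environment and not in another.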
So I switched to a character stream, wrapping the input stream in an InputStreamReader:

import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

/**
 * Returns the HTML data at the given URL.
 *
 * @param urlStr
 * @return
 * @throws CommonException
 */
public static String doGet(String urlStr) throws CommonException {
    StringBuffer sb = new StringBuffer();
    try {
        URL url = new URL(urlStr);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setDoInput(true);
        conn.setDoOutput(true); // only needed when sending a request body
        if (conn.getResponseCode() == 200) {
            InputStream is = conn.getInputStream();
            // The reader decodes UTF-8 across read boundaries.
            InputStreamReader isr = new InputStreamReader(is, "UTF-8");
            int len = 0;
            char[] buf = new char[1024];
            while ((len = isr.read(buf)) != -1) {
                sb.append(buf, 0, len);
            }
            isr.close(); // also closes the underlying input stream
        } else {
            throw new CommonException("Network request failed!");
        }
    } catch (Exception e) {
        throw new CommonException("Network request failed!");
    }
    return sb.toString();
}
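However, this may still not solve the problem.
The ultimate method
When reading the data stream, don't set a buffer size; read the data directly, one line at a time.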
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// urlencode, userAgent, DEF_CONN_TIMEOUT, DEF_CHATSET and isMessyCode belong to
// the surrounding class and are not shown in this post; neither is the JSONObject import.
public static String doGET(final String strUrl, JSONObject params) throws Exception {
    HttpURLConnection conn = null;
    String rs = "";
    try {
        StringBuilder sb = new StringBuilder();
        String strUrl_ = strUrl + ("?" + urlencode(params));
        URL url = new URL(strUrl_);
        conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("User-agent", userAgent);
        conn.setUseCaches(false);
        // Configures streaming of a request body; a GET sends none,
        // so this call is probably unnecessary.
        conn.setChunkedStreamingMode(0);
        conn.setConnectTimeout(DEF_CONN_TIMEOUT);
        conn.connect();
        InputStream is = conn.getInputStream();
        if (is != null) {
            String line;
            BufferedReader reader = new BufferedReader(new InputStreamReader(is, DEF_CHATSET));
            // readLine() drops the line terminators, so the lines are joined without them.
            while ((line = reader.readLine()) != null) {
                isMessyCode(line); // the return value is not used here
                sb.append(line);
            }
            reader.close(); // also closes the underlying input stream
        }
        rs = sb.toString();
    } catch (Exception e) {
        // TODO: handle exception
        e.printStackTrace();
    } finally {
        if (conn != null) {
            conn.disconnect();
        }
    }
    return rs;
}
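The reading part of this method can be tried on its own, without a server. Below is a minimal sketch under my own assumptions: the class LineReaderSketch and the helper readLines are hypothetical names, and a ByteArrayInputStream stands in for the connection's input stream. Unlike the method above, it puts a '\n' back after each line, which also means the result gains a trailing newline if the input did not end with one.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class LineReaderSketch {

    // Hypothetical helper: reads the whole stream line by line, then closes it.
    static String readLines(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() strips the line terminator, so put one back.
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Build a body well over 1024 bytes that mixes 3-byte and 1-byte characters.
        StringBuilder body = new StringBuilder();
        for (int i = 0; i < 500; i++) {
            body.append("\u4e2d\u6587").append(i).append('\n');
        }
        byte[] bytes = body.toString().getBytes(StandardCharsets.UTF_8);

        String decoded = readLines(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        System.out.println(decoded.equals(body.toString())); // true
    }
}
```

Note that the decoding here is still done by InputStreamReader. readLine() only changes how the already decoded characters are grouped, so the charset passed in still has to match what the server actually sends.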
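This article was written by admin and is licensed under the Creative Commons Attribution 4.0 International License.
Unless marked as reposted or sourced from elsewhere, all articles on this site are original works or translations by this site. Please credit the source before reposting.
Last edited: 2022-04-07 21:50:48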