人気ブログランキング | 話題のタグを見る

System.IO モジュール(10) readFile, writeFile, ...

System.IO モジュールの Special cases セクションの関数は、readFile, writeFile, appendFile だ。簡単なプログラムを作るときは、これら3つの命令で十分だ。それについては、このブログの「ファイルの読み書き」の記事で述べた。

ここでは、視点を変えてこれらのソースを見てみよう。Haskell Wiki の System.IO モジュールのページの関数の説明の左肩のリンクをクリックすると、その関数のソースを読むことができる。

readFile のソースは次のようになっている。
readFile :: FilePath -> IO String
readFile name = openFile name ReadMode >>= hGetContents

やけに簡単だ。ということは、ほとんどの処理は、hGetContents が行っていることになる。そこで、hGetContents のソースも見てみた。

-- hGetContents

-- hGetContents on a DuplexHandle only affects the read side: you can
-- carry on writing to it afterwards.

-- | Computation 'hGetContents' @hdl@ returns the list of characters
-- corresponding to the unread portion of the channel or file managed
-- by @hdl@, which is put into an intermediate state, /semi-closed/.
-- In this state, @hdl@ is effectively closed,
-- but items are read from @hdl@ on demand and accumulated in a special
-- list returned by 'hGetContents' @hdl@.
--
-- Any operation that fails because a handle is closed,
-- also fails if a handle is semi-closed.  The only exception is 'hClose'.
-- A semi-closed handle becomes closed:
--
--  * if 'hClose' is applied to it;
--
--  * if an I\/O error occurs when reading an item from the handle;
--
--  * or once the entire contents of the handle has been read.
--
-- Once a semi-closed handle becomes closed, the contents of the
-- associated list becomes fixed.  The contents of this final list is
-- only partially specified: it will contain at least all the items of
-- the stream that were evaluated prior to the handle becoming closed.
--
-- Any I\/O errors encountered while a handle is semi-closed are simply
-- discarded.
--
-- This operation may fail with:
--
--  * 'isEOFError' if the end of file has been reached.

hGetContents :: Handle -> IO String
hGetContents handle =
   wantReadableHandle "hGetContents" handle $ \handle_ -> do
      xs <- lazyRead handle
      return (handle_{ haType=SemiClosedHandle}, xs )

-- Note that someone may close the semi-closed handle (or change its
-- buffering), so each time these lazy read functions are pulled on,
-- they have to check whether the handle has indeed been closed.

lazyRead :: Handle -> IO String
lazyRead handle =
   unsafeInterleaveIO $
        withHandle "hGetContents" handle $ \ handle_ -> do
        case haType handle_ of
          ClosedHandle     -> return (handle_, "")
          SemiClosedHandle -> lazyReadBuffered handle handle_
          _ -> ioException
                  (IOError (Just handle) IllegalOperation "hGetContents"
                        "illegal handle type" Nothing Nothing)

lazyReadBuffered :: Handle -> Handle__ -> IO (Handle__, [Char])
lazyReadBuffered h handle_@Handle__{..} = do
   buf <- readIORef haCharBuffer
   catch
        (do
            buf'@Buffer{..} <- getSomeCharacters handle_ buf
            lazy_rest <- lazyRead h
            (s,r) <- if haInputNL == CRLF
                         then unpack_nl bufRaw bufL bufR lazy_rest
                         else do s <- unpack bufRaw bufL bufR lazy_rest
                                 return (s,bufR)
            writeIORef haCharBuffer (bufferAdjustL r buf')
            return (handle_, s)
        )
        (\e -> do (handle_', _) <- hClose_help handle_
                  debugIO ("hGetContents caught: " ++ show e)
                  -- We might have a \r cached in CRLF mode.  So we
                  -- need to check for that and return it:
                  let r = if isEOFError e
                             then if not (isEmptyBuffer buf)
                                     then "\r"
                                     else ""
                             else
                                  throw (augmentIOError e "hGetContents" h)

                  return (handle_', r)
        )

-- ensure we have some characters in the buffer
getSomeCharacters :: Handle__ -> CharBuffer -> IO CharBuffer
getSomeCharacters handle_@Handle__{..} buf@Buffer{..} =
  case bufferElems buf of

    -- buffer empty: read some more
    0 -> readTextDevice handle_ buf

    -- if the buffer has a single '\r' in it and we're doing newline
    -- translation: read some more
    1 | haInputNL == CRLF -> do
      (c,_) <- readCharBuf bufRaw bufL
      if c == '\r'
         then do -- shuffle the '\r' to the beginning.  This is only safe
                 -- if we're about to call readTextDevice, otherwise it
                 -- would mess up flushCharBuffer.
                 -- See [note Buffer Flushing], GHC.IO.Handle.Types
                 _ <- writeCharBuf bufRaw 0 '\r'
                 let buf' = buf{ bufL=0, bufR=1 }
                 readTextDevice handle_ buf'
         else do
                 return buf

    -- buffer has some chars in it already: just return it
    _otherwise ->
      return buf

ちょっと歯が立たないが、実用的な Haskell プログラムの雰囲気がわかる。

冒頭の説明を読むと、hGetContents でファイルが読み出される場合、そのファイルハンドルはセミ・クローズドの状態になって、ファイルはいったんクローズされるが、読み出しの追加要求があったときには、そのハンドルで読み出しを続けることができる。ファイルがEOFになると、ファイルハンドルは完全にクローズされる。遅延評価に対応する設計になっている。

しかし、複雑なプログラムのおかげで、使う方は、単にファイルの内容を読み出すというイメージで使える。

ちなみに、writeFile, appendFile のソースは次のようになる。

writeFile :: FilePath -> String -> IO ()
writeFile f txt = withFile f WriteMode (\ hdl -> hPutStr hdl txt)

appendFile :: FilePath -> String -> IO ()
appendFile f txt = withFile f AppendMode (\ hdl -> hPutStr hdl txt)
by tnomura9 | 2011-09-15 23:05 | Haskell | Comments(0)
<< System.IO モジュール... System.IO モジュール... >>